Welcome!

class: center, middle, inverse, title-slide

# Welcome!
## An overview of the course & intro to data types
### Daniel Anderson
### Week 1

---

layout: true

</span>
  </div>

---

# Agenda

.pull-left[

* Getting on the same page

* Syllabus

* Intro to data types

]

.pull-right[

.right[

]

---
background-image: url(https://images.pexels.com/photos/1005324/literature-book-open-pages-1005324.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=650&w=940)
class: inverse bottom
background-size:cover

# Getting on the same page

---
# Introduce yourself!

* We mostly know each other, but it's always good to hear from each other.

* Tell us why you're taking the class

* What's one fun thing you've done recently outside of school stuff?

* What pronouns would you like us to use for you in this class?

---
# This term for me

* Full-time employed with [Abl](https://www.ablschools.com/) starting next week.

* This is my only UO responsibilities

* Doesn't mean I'm less committed to you

* Does mean time is going to be tricky

+ Friday afternoons will be best for meetings - can do evenings too if needed.

---
background-image: url(https://akm-img-a-in.tosshub.com/indiatoday/images/story/201802/Syllabus-featured.jpeg?rpmIvNBBqGMDT7lPG.Qe4AzWs6OfDh9K)
class: inverse bottom
background-size:cover

# Syllabus

---
# Course Website(s)

![](img/course-website.png)

---
# Course learning objectives

* Understand and be able to describe the differences in R's data structures and when each is most appropriate for a given task

--
* Explore `purrr::map` and its variants, how they relate to base R functions, and why the {purrr} variants are often preferable.

--
* Work with lists and list columns using `purrr::nest` and `purrr:unnest`

--
.gray[

* Understand how `dplyr::rowwise()` can help you avoid some of the above

]

---
# Course learning objectives

* Convert repetitive tasks into functions

--
* Understand elements of good functions, and things to avoid

--
* Write effective and clear functions with the mantra of "Don't Repeat Yourself"

---
# This Week's learning objectives

* Understand the requirements of the course

* Understand the requirements of the final project

* Understand the fundamental difference between lists and atomic vectors

--
* Understand how atomic vectors are coerced, implicitly or explicitly

--
* Understand various ways to subset vectors, and how subsetting differs for lists

--
* Understand attributes and how to set/modify

---
class: inverse-blue middle
# Textbooks

---
background-image:url(https://d33wubrfki0l68.cloudfront.net/565916198b0be51bf88b36f94b80c7ea67cafe7c/7f70b/cover.png)
background-size:contain
class: middle

.right[
[Link](https://adv-r.hadley.nz)
]

---
# Other books (also free)

.pull-left[
[Bryan](http://happygitwithr.com)
<div>
<img src = https://happygitwithr.com/img/watch-me-diff-watch-me-rebase-smaller.png height = 400> </div>
]

.pull-right[
[Wickham & Grolemund](https://r4ds.had.co.nz)
<div>
<img src =https://d33wubrfki0l68.cloudfront.net/b88ef926a004b0fce72b2526b0b5c4413666a4cb/24a30/cover.png height = 400> </div>

]

---
# Structure of the course

* First 5 weeks - mostly iteration

+ Data types
  + Base R iterations
  + {purrr}
  + Batch processes and working with list columns
  + Parallel iterations (and a few extras)

--
* Second  5 weeks - Writing functions and shiny
  + Writing functions 1-3
  + Shiny 1-3

---
# Labs
### 15%
### 3 @ 10 points each

Two labs on iteration and one on functions.

If possible - please try to be in-class on Lab days. I understand this is not always possible, but it helps me help you.

| Lab|Date Assigned |Date Due      |Topic                                          |
|---:|:-------------|:-------------|:----------------------------------------------|
|   1|Mon, April 04 |Mon, April 11 |Subsetting lists and base R `for()` loops      |
|   2|Mon, April 11 |Mon, April 18 |Multiple models and API calls with **{purrr}** |
|   3|Mon, May 09   |Mon, May 16   |Create and apply functions                     |

---
# Midterm
### 70 points total (35%)
Two parts: 
* Small quiz on canvas to demonstrate knowledge (4/22; 10 points)
  + Identifying bugs in code
  + Multiple choice/fill-in the blank questions
  + Free response
  
* Take-home portion to demonstrate ability to write the correct code (assigned 4/22; 60 points)
  + Write loops to solve problems

---
# Take home midterm

* Group project: 3-5 people

* Shared GitHub repo

* Divide it up to ease the workload, then just check each other's work

Already <a href="../take-home-midterm" target="_blank">posted</a>!

---
# Final Project 
### 100 points total (60%)

5 parts

|  **Component** |**Due**  | **Points** | **Percentage of final grade**  |
|  :---------------- | :------  |:--:|:--:|
|  Groups finalized  | 04-04-22 | 0  | 0  |
|  [Outline](https://fp-2022.netlify.app/assignments/#outline) | 04-18-22 | 5  | 2.5|
|  [Draft data script](https://fp-2022.netlify.app/assignments/#draft-data-preparation-script) | 05-16-22 | 10 | 5  |
|  [Peer Review](https://fp-2022.netlify.app/assignments/#peer-review) | 05-23-22 | 15 | 7.5|
|  [Final Product](https://fp-2022.netlify.app/assignments/#final-product) | 06-06-22 | 70 | 35 |
---
class: inverse-red middle
# What is it?
## Two basic options

---
# Data Product 
Similar to first class

* Brief research manuscript (can be APA or not, I don't really care 🤷‍♂️)

* Shiny app

* Dashboard
  + Probably unlikely to work well though, unless you make it a shiny dashboard

* Blog post

* For the ambitious - a documented R package

---
# Tutorial
* Probably best done through a blog post or series of blog posts

* Approach as if you're teaching others about the content I'll ask you to cover

* BONUS: You can actually release the blog post(s) and may get some traffic 🎉👏🥳

---
background-image:url(https://images.pexels.com/photos/1252869/pexels-photo-1252869.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260)
background-size:cover
class:inverse center middle

# Make it your own

---
# What you have to have
* Everything on GitHub

* Publicly available dataset

* Team of 2-5

---
# What you have to cover
Unfortunately, this still is a class assignment. I have to be able to evaluate
that you can actually apply the content within a messy, real-world setting.

--
The grading criteria (which follow) may force you into some use cases that are 
a bit artificial. This is okay.

---
# Grading criteria

* No code is used repetitively (no more than twice) .gray[10 points]

--
* More than one variant of `purrr::map` is used .gray[5 points]

--
* At least one {purrr} function outside the basic `map` family (`walk_*`, 
`reduce`, `modify_*`, etc.) .gray[5 points]

--
* At least one instance of parallel iteration (e.g., `map2_*`, `pmap_*`) .gray[5 points]

--
* At least one use case of `purrr::nest %>% mutate()` .gray[5 points]

---
# Grading criteria

* At least two custom functions .gray[20 points; 10 points each]

+ Each function must do exactly one thing
  + The functions **may** replicate the behavior of a base function - as noted above this is about practicing the skills you learn in class

--
* Code is fully reproducible and housed on GitHub .gray[10 points]

--
* No obvious errors in chosen output format .gray[5 points]

--
* Deployed on the web and shareable through a link .gray[5 points]

---
# Outline
### Due 4/18/21

Four components:
* Description of data source (must be publicly available)

* Purpose (tutorial or substantive)

* Chosen format

* Lingering questions
  + How can I help?

--
Please include all components - including the question(s) section!

---
# Draft
### Due 5/16/21, before class
* Expected to still be a work in progress

+ This means some of your code may be rough and/or incomplete. However:

* Direction should be obvious

* Most, if not all, grading elements should be present

* Provided to your peers so they can learn from you as much as you can learn 
  from their feedback

---
# Peer Review
* Exact same process we've used before

* If, during your peer review, you find grading elements not present,
definitely note them

---
# Utilizing GitHub (required)
* You'll be assigned two groups to review

* Fork their repo

* Embed comments, suggest changes to their code
  + Please do both of these

* Submit a PR
  
  + Summarize your overall review in the PR

---
# Grading
200 points total

* 3 labs at 10 points each (30 points; 15%)

* Midterm in-class (10 points; 5%)

* Midterm take-home (60 points; 30%)

* Final Project (100 points; 50%)

+ Outline (5 points; 2.5%)

+ Draft (10 points; 5%)

+ Peer review (15 points; 7.5%)

+ Product (70 points; 35%)

---
# Grading

|  **Lower percent** |**Lower point range**  | **Grade** | **Upper point range**  | **Upper percent**|
|  :-------- | :-----------------| :--| ----------------: | -----:|
|  0.970+    | (194 pts or more) | A+ |                   |       |
|  0.930     | (186 pts)         | A  | (193 pts)         | 0.969 |
|  0.900     | (180 pts)         | A- | (185 pts)         | 0.929 |
|  0.870     | (174 pts)         | B+ | (179 pts)         | 0.899 |
|  0.830     | (166 pts)         | B  | (173 pts)         | 0.869 |
|  0.800     | (160 pts)         | B- | (165 pts)         | 0.829 |
|  0.770     | (154 pts)         | C+ | (159 pts)         | 0.799 |
|  0.730     | (146 pts)         | C  | (153 pts)         | 0.769 |
|  0.700     | (140 pts)         | C- | (145 pts)         | 0.739 |
|            |                   | F  | (139 pts or less) | 0.699 |

---
# A note on feedback

* Last term, I gave you feedback on everything you submitted

* This term, I will give you feedback on the midterm and the final only

* Labs scored on a completion basis

* We will go over everything in class

---
class: middle
# Week 6
## Fully remote
I will be traveling and plan to teach from my hotel room.

---
class: inverse
background-image:url(https://d194ip2226q57d.cloudfront.net/original_images/10_Tips_for_Workplace_Communication)
background-size:contain

---
class: inverse-blue center middle
# Break

---
class: inverse-red center middle
# Basic data types

---
# Vectors
### Pop quiz
Discuss in small breakout groups
* What are the four basic types of atomic vectors?

* What function creates a vector?

* **T**/**F**: A list (an R list) is not a vector.

* What is the fundamental difference between a matrix and a data frame?

* What does *coercion* mean, and when does it come into play?

---
# Vector types
### 4 basic types
Note there are two others (complex and raw), but we almost never care about them.

* Integer

* Double

* Logical

* Character

--
Integer and double vectors are both numeric.

---
# Creating vectors

Vectors are created with `c`. Below are examples of each of the four main types
of vectors.

```r
# L explicitly an integer, not double
integer <- c(5L, 7L, 3L, 94L) 
double <- c(3.27, 8.41, Inf, -Inf)
logical <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
character <- c("red", "orange", "yellow", "green", "blue", 
               "violet", "rainbow")
```

---
# Coercion
* Vectors **must** be of the same type.

* If you try to mix types, implicit coercion will occur

* Implicit coercion defaults to the most flexible type
  + which is... ?

.pull-left[

```r
c(7L, 3.25)
```

```
## [1] 7.00 3.25
```

```r
c(3.24, TRUE, "April")
```

```
## [1] "3.24"  "TRUE"  "April"
```
]

.pull-right[

```r
c(TRUE, 5)
```

```
## [1] 1 5
```
]

---
# Explicit coercion
* You can alternatively define the coercion to occur

```r
as.integer(c(7L, 3.25))
```

```
## [1] 7 3
```

```r
as.logical(c(3.24, TRUE, "April"))
```

```
## [1]   NA TRUE   NA
```

```r
as.character(c(TRUE, 5)) # still maybe a bit unexpected?
```

```
## [1] "1" "5"
```

---
# Checking types

* Use `typeof` to verify the type of vector

```r
typeof(c(7L, 3.25))
```

```
## [1] "double"
```

```r
typeof(as.integer(c(7L, 3.25)))
```

```
## [1] "integer"
```

---
# Coercing to logical

```r
as.logical(c(0, 1, 1, 0))
```

```
## [1] FALSE  TRUE  TRUE FALSE
```

--
Any number that is not zero gets coerced to `FALSE`

```r
as.logical(c(0, 5L, 7.4, -1.6, 0))
```

```
## [1] FALSE  TRUE  TRUE  TRUE FALSE
```

--
Wait... why the `NA` here?

```r
as.logical(c(3.24, TRUE, "April"))
```

```
## [1]   NA TRUE   NA
```

---
# Piping
* Although traditionally used within the tidyverse (not what we're doing here),
it can still be useful. The following are equivalent

```r
library(magrittr)

typeof(as.integer(c(7L, 3.25)))
```

```
## [1] "integer"
```

```r
c(7L, 3.25) %>%
  as.integer() %>%
  typeof()
```

```
## [1] "integer"
```

---
# Don't actually need magrittr

## Base pipe! |>

This is a bigger topic than I want to get into here, but for reference...

```r
c(7L, 3.25) |>
  as.integer() |>
  typeof()
```

```
## [1] "integer"
```

---
# Pop quiz
Without actually running the code, predict which type each of
the following will coerce to.

```r
c(1.25, TRUE, 4L)
c(1L, FALSE)
c(7L, 6.23, "eight")
c(TRUE, 1L, 0L, "False")
```

<div class="countdown" id="timer_6241dd66" style="right:0;bottom:0;" data-warnwhen="0">
<code class="countdown-time"><span class="countdown-digits minutes">01</span><span class="countdown-digits colon">:</span><span class="countdown-digits seconds">00</span></code>
</div>
---
# Answers

```r
typeof(c(1.25, TRUE, 4L))
```

```
## [1] "double"
```

```r
typeof(c(1L, FALSE))
```

```
## [1] "integer"
```

```r
typeof(c(7L, 6.23, "eight"))
```

```
## [1] "character"
```

```r
typeof(c(TRUE, 1L, 0L, "False"))
```

```
## [1] "character"
```

---
# Lists
* Lists are vectors, but not *atomic* vectors

* Fundamental difference - each element can be a different type

```r
list("a", 7L, 3.25, TRUE)
```

```
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] 7
## 
## [[3]]
## [1] 3.25
## 
## [[4]]
## [1] TRUE
```

---
# Lists

.pull-left[

* Each element of the list is another vector, possibly atomic, possibly not

* The prior example included all *scalar* vectors

* Lists do not require all elements to be the same length

]

.pull-right[

```r
list(
  c("a", "b", "c"),
  rnorm(5),
  c(7L, 2L),
  c(TRUE, TRUE, FALSE, TRUE)
)
```

```
## [[1]]
## [1] "a" "b" "c"
## 
## [[2]]
## [1] -0.2693514 -0.3909654  1.3487070
## [4] -0.0227647  0.2442259
## 
## [[3]]
## [1] 7 2
## 
## [[4]]
## [1]  TRUE  TRUE FALSE  TRUE
```
]

---
# Summary
* Atomic vectors must all be the same type

+ implicit coercion occurs if not (and you haven't specified the coercion explicitly)

* Lists are also vectors, but not atomic vectors
  
  + Each element can be of a different type and length

+ Incredibly flexible, but often a little more difficult to get the hang of

---
# Challenge

### Work with a partner
One of you share your screen: 
* Create four atomic vectors, one for each of the fundamental types

* Combine two or more of the vectors. Predict the implicit coercion of each.

* Apply explicit coercions, and predict the output for each.

(basically quiz each other)

---
class: inverse-blue middle
# Attributes

---
# Attributes
* What are attributes?

--
  + metadata... what's metadata?

--
  + Data about the data

---
# Other data types

Atomic vectors by themselves make up only a small fraction of the total number of data types in R

--
### What are some other data types?

--
* Data frames

--
* Matrices & arrays

--
* Factors

--
* Dates

--
Remember, atomic vectors are the atoms of R. Many other data structures are
built from atomic vectors.

* We use attributes to create other data types from atomic vectors

---
# Attributes

### Common
* Names
* Dimensions

### Less common
* Arbitrary metadata

---
# Examples
### Please follow along!
* See **all** attributes associated with a give object with `attributes`

```r
library(palmerpenguins)
attributes(penguins[1:50, ]) # limiting rows just for slides
```

```
## $names
## [1] "species"           "island"           
## [3] "bill_length_mm"    "bill_depth_mm"    
## [5] "flipper_length_mm" "body_mass_g"      
## [7] "sex"               "year"             
## 
## $row.names
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14
## [15] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
## [29] 29 30 31 32 33 34 35 36 37 38 39 40 41 42
## [43] 43 44 45 46 47 48 49 50
## 
## $class
## [1] "tbl_df"     "tbl"        "data.frame"
```

---

```r
head(penguins)
```

```
## # A tibble: 6 × 8
##   species island    bill_length_mm
##   <fct>   <fct>              <dbl>
## 1 Big one Torgersen           39.1
## 2 Big one Torgersen           39.5
## 3 Big one Torgersen           40.3
## 4 Big one Torgersen           NA  
## 5 Big one Torgersen           36.7
## 6 Big one Torgersen           39.3
## # … with 5 more variables:
## #   bill_depth_mm <dbl>,
## #   flipper_length_mm <int>,
## #   body_mass_g <int>, sex <fct>, year <int>
```

---
# Get specific attribute
* Access just a single attribute by naming it within `attr`

```r
attr(penguins, "class")
```

```
## [1] "tbl_df"     "tbl"        "data.frame"
```

```r
attr(penguins, "names")
```

```
## [1] "species"           "island"           
## [3] "bill_length_mm"    "bill_depth_mm"    
## [5] "flipper_length_mm" "body_mass_g"      
## [7] "sex"               "year"
```

--
Note - this is not generally how you would pull these attributes. Rather, you would use `class()` and `names()`.

---
# Be specific
* Note in the prior slides, I'm asking for attributes on the entire data frame.

* Is that what I want?... maybe. But the individual vectors may have attributes as well

```r
attributes(penguins$species)
```

```
## $levels
## [1] "Big one"    "Little one" "Funny one" 
## 
## $class
## [1] "factor"
```

```r
attributes(penguins$bill_length_mm)
```

```
## NULL
```

---
# Set attributes
* Just redefine them within `attr`

```r
attr(penguins$species, "levels") <- c("Big one", 
                                      "Little one", 
                                      "Funny one")

head(penguins)
```

Note - you would generally not define levels this way either, but it is a general method for modifying attributes.

---
# Dimensions

* Let's create a matrix (please do it with me)

```r
m <- matrix(1:6, ncol = 2)
m
```

```
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```

* Notice how the matrix fills

--
* Check out the attributes

```r
attributes(m)
```

```
## $dim
## [1] 3 2
```

---
# Modify the attributes
* Let's change it to a 2 x 3 matrix, instead of 3 x 2 (you try first)

```r
attr(m, "dim") <- c(2, 3)
m
```

```
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
```

--
* is this the result you expected?

---
# Alternative creation
* Create an atomic vector, assign a dimension attribute

```r
v <- 1:6
v
```

```
## [1] 1 2 3 4 5 6
```

```r
attr(v, "dim") <- c(3, 2)
v
```

```
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```

---
# Aside
* What if we wanted it to fill by row?

.pull-left[

```r
matrix(6:13, 
       ncol = 2, 
*      byrow = TRUE)
```

```
##      [,1] [,2]
## [1,]    6    7
## [2,]    8    9
## [3,]   10   11
## [4,]   12   13
```

```r
vect <- 6:13
dim(vect) <- c(2, 4)
vect
```

```
##      [,1] [,2] [,3] [,4]
## [1,]    6    8   10   12
## [2,]    7    9   11   13
```
]

.pull-right[

```r
t(vect)
```

```
##      [,1] [,2]
## [1,]    6    7
## [2,]    8    9
## [3,]   10   11
## [4,]   12   13
```
]

---
# Names
* The following (this slide and the next) are equivalent

```r
dim_names <- list(
    c("the first", "second", "III"),
    c("index", "value")
  )

attr(v, "dimnames") <- dim_names
  
v
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
```

---
# Names

```r
v2 <- 1:6
attr(v2, "dim") <- c(3, 2)
rownames(v2) <- c("the first", "second", "III")
colnames(v2) <- c("index", "value")
v2
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
```

---
# Arbitrary metadata

```r
attr(v, "matrix_mean") <- mean(v)
v
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
## attr(,"matrix_mean")
## [1] 3.5
```

```r
attr(v, "matrix_mean")
```

```
## [1] 3.5
```

* Note that *anything* can be stored as an attribute (including matrices or data frames, etc.)

---
# Why would we do this?

This is a short example that is based on a real example

* Imagine we're accessing a database that has many years of data

* The tables in the database are the same, but the values (of course) differ

* We might want to return the data, but store the year as an attribute

--
This example is overly complicated for the first day, but it will give you an idea of what we're building toward.

---
# Made up db
I'm using a list to mimic a database

```r
db <- list(
  data.frame(
    color = c("red", "orange", "green"),
    transparency = c(0.80, 0.65, 0.93)
  ),
  data.frame(
    color = c("blue", "pink", "cyan"),
    transparency = c(0.40, 0.35, 0.87)
  )
)
```

---

```r
db
```

```
## [[1]]
##    color transparency
## 1    red         0.80
## 2 orange         0.65
## 3  green         0.93
## 
## [[2]]
##   color transparency
## 1  blue         0.40
## 2  pink         0.35
## 3  cyan         0.87
```

---
# Write a function
Let's write a function that grabs one of these tables. If it's "1920" we'll grab the first one, otherwise we'll grab the second one.

```r
pull_color_data <- function(year) {
  to_pull <- if(year == "1920") {
    out <- db[[1]]
  } else {
    out <- db[[2]]
  }
  out
}
```

---
# Does it work?

.pull-left[

```r
pull_color_data(1920)
```

```
##    color transparency
## 1    red         0.80
## 2 orange         0.65
## 3  green         0.93
```

```r
pull_color_data(2021)
```

```
##   color transparency
## 1  blue         0.40
## 2  pink         0.35
## 3  cyan         0.87
```

```r
pull_color_data(2122)
```

```
##   color transparency
## 1  blue         0.40
## 2  pink         0.35
## 3  cyan         0.87
```

]

.pull-right[
## Yes!
]
---
# Build a second function

Let's say we want to make a second function that does something with the previous output.

--
## BUT

--
What we do with it is going to depend on the data frame we get back.

--
We need to know the year.

--
Modify our original function to store the year as an attribute!

---
Notice we redefine the attributes so we're including all the prior attributes it already had.

```r
pull_color_data <- function(year) {
  to_pull <- if(year == "1920") {
    out <- db[[1]]
  } else {
    out <- db[[2]]
  }
* attributes(out) <- c(
*   attributes(out),
*   db = year
* )
  out
}
```

---
# Try

```r
pull_color_data(1920)
```

```
##    color transparency
## 1    red         0.80
## 2 orange         0.65
## 3  green         0.93
```

```r
attr(pull_color_data(2021), "db")
```

```
## [1] 2021
```

---
# Build our second function

Now, we can make our second function, and have it do something different depending on the data that is passed to it.

```r
print_colors <- function(color_data) {
  title <- paste0("Colors for ", attr(color_data, "db")) 
  colorspace::swatchplot(color_data$color)
  mtext(title) # base plotting function
}
```

---

```r
pull_color_data(1920) %>% 
  print_colors()
```

![](w1_files/figure-html/unnamed-chunk-19-1.png)

---

```r
pull_color_data(2021) %>% 
  print_colors()
```

![](w1_files/figure-html/unnamed-chunk-20-1.png)

---
# Another example
Fit a multilevel model and pull the variance-covariance matrix

```r
m <- lme4::lmer(Reaction ~ 1 + Days + (1 + Days|Subject), 
                data = lme4::sleepstudy)

lme4::VarCorr(m)$Subject
```

```
##             (Intercept)      Days
## (Intercept)  612.100158  9.604409
## Days           9.604409 35.071714
## attr(,"stddev")
## (Intercept)        Days 
##   24.740658    5.922138 
## attr(,"correlation")
##             (Intercept)       Days
## (Intercept)  1.00000000 0.06555124
## Days         0.06555124 1.00000000
```

---
# Matrices vs Data frames

Usually we want to work with data frames because they represent our data better.

Sometimes a matrix is more efficient because you can operate on the **entire** matrix at once.

```r
set.seed(42)
m <- matrix(rnorm(100, 200, 10), ncol = 10)
m
```

```
##           [,1]     [,2]     [,3]     [,4]
##  [1,] 213.7096 213.0487 196.9336 204.5545
##  [2,] 194.3530 222.8665 182.1869 207.0484
##  [3,] 203.6313 186.1114 198.2808 210.3510
##  [4,] 206.3286 197.2121 212.1467 193.9107
##  [5,] 204.0427 198.6668 218.9519 205.0496
##  [6,] 198.9388 206.3595 195.6953 182.8299
##  [7,] 215.1152 197.1575 197.4273 192.1554
##  [8,] 199.0534 173.4354 182.3684 191.4909
##  [9,] 220.1842 175.5953 204.6010 175.8579
## [10,] 199.3729 213.2011 193.6001 200.3612
##           [,5]     [,6]     [,7]     [,8]
##  [1,] 202.0600 203.2193 196.3277 189.5688
##  [2,] 196.3894 192.1616 201.8523 199.0981
##  [3,] 207.5816 215.7573 205.8182 206.2352
##  [4,] 192.7330 206.4290 213.9974 190.4648
##  [5,] 186.3172 200.8976 192.7271 194.5717
##  [6,] 204.3282 202.7655 213.0254 205.8100
##  [7,] 191.8861 206.7929 203.3585 207.6818
##  [8,] 214.4410 200.8983 210.3851 204.6377
##  [9,] 195.6855 170.0691 209.2073 191.1422
## [10,] 206.5565 202.8488 207.2088 189.0022
##           [,9]    [,10]
##  [1,] 215.1271 213.9212
##  [2,] 202.5792 195.2383
##  [3,] 200.8844 206.5035
##  [4,] 198.7910 213.9111
##  [5,] 188.0567 188.8921
##  [6,] 206.1200 191.3921
##  [7,] 197.8286 188.6826
##  [8,] 198.1724 185.4079
##  [9,] 209.3335 200.7998
## [10,] 208.2177 206.5320
```

---

```r
sum(m)
```

```
## [1] 20032.51
```

```r
mean(m)
```

```
## [1] 200.3251
```

```r
rowSums(m)
```

```
##  [1] 2048.470 1993.774 2041.155 2025.924
##  [5] 1978.173 2007.265 1998.086 1960.291
##  [9] 1952.476 2026.901
```

```r
colSums(m)
```

```
##  [1] 2054.730 1983.654 1982.192 1963.610
##  [5] 1997.978 2001.839 2053.908 1978.212
##  [9] 2025.111 1991.281
```

```r
# standardize the matrix
z <- (m - mean(m)) / sd(m)
```

---

```r
z
```

```
##              [,1]       [,2]       [,3]
##  [1,]  1.28528802  1.2218239 -0.3256841
##  [2,] -0.57349498  2.1646089 -1.7417882
##  [3,]  0.31748345 -1.3649263 -0.1963133
##  [4,]  0.57650528 -0.2989403  1.1352110
##  [5,]  0.35698951 -0.1592501  1.7887033
##  [6,] -0.13313334  0.5794704 -0.4445968
##  [7,]  1.42026916 -0.3041875 -0.2782756
##  [8,] -0.12212321 -2.5821792 -1.7243635
##  [9,]  1.90703954 -2.3747685  0.4106013
## [10,] -0.09144695  1.2364622 -0.6458013
##              [,4]       [,5]        [,6]
##  [1,]  0.40613865  0.1665940  0.27791666
##  [2,]  0.64562157 -0.3779416 -0.78393268
##  [3,]  0.96277141  0.6968297  1.48192480
##  [4,] -0.61596668 -0.7290676  0.58614338
##  [5,]  0.45367758 -1.3451640  0.05497234
##  [6,] -1.68004206  0.3844054  0.23434417
##  [7,] -0.78452812 -0.8103926  0.62108770
##  [8,] -0.84833774  1.3555260  0.05504171
##  [9,] -2.34955213 -0.4455350 -2.90544454
## [10,]  0.00346451  0.5983857  0.24234547
##             [,7]       [,8]        [,9]
##  [1,] -0.3838736 -1.0329155  1.42140711
##  [2,]  0.1466507 -0.1178282  0.21645471
##  [3,]  0.5274934  0.5675319  0.05370436
##  [4,]  1.3129235 -0.9468782 -0.14731870
##  [5,] -0.7296315 -0.5524942 -1.17812024
##  [6,]  1.2195893  0.5266990  0.55646825
##  [7,]  0.2912866  0.7064474 -0.23973975
##  [8,]  0.9660389  0.4141258 -0.20672212
##  [9,]  0.8529388 -0.8818216  0.86505545
## [10,]  0.6610253 -1.0873272  0.75791330
##             [,10]
##  [1,]  1.30560568
##  [2,] -0.48848642
##  [3,]  0.59329679
##  [4,]  1.30463971
##  [5,] -1.09789797
##  [6,] -0.85783015
##  [7,] -1.11801576
##  [8,] -1.43248556
##  [9,]  0.04558258
## [10,]  0.59603916
```

---
# Stripping attributes
* Many operations will strip attributes (which makes storing important things in them a bit precarious)

.pull-left[

```r
v
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
## attr(,"matrix_mean")
## [1] 3.5
```

```r
rowSums(v)
```

```
## the first    second       III 
##         5         7         9
```
]

--
.pull-right[

```r
attributes(rowSums(v))
```

```
## $names
## [1] "the first" "second"    "III"
```

* Generally `names` are maintained

* Sometimes, `dim` is maintained, sometimes not

* All else is stripped

]

---
# More on `names`

* The `names` attribute corresponds to the individual elements within a vector

```r
names(v)
```

```
## NULL
```

```r
names(v) <- letters[1:6]
v
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
## attr(,"matrix_mean")
## [1] 3.5
## attr(,"names")
## [1] "a" "b" "c" "d" "e" "f"
```

---
* Perhaps more straightforward

```r
v3a <- c(a = 5, b = 7, c = 12)
v3a
```

```
##  a  b  c 
##  5  7 12
```

```r
names(v3a)
```

```
## [1] "a" "b" "c"
```

```r
attributes(v3a)
```

```
## $names
## [1] "a" "b" "c"
```

---
# Alternatives

```r
v3b <- c(5, 7, 12)
names(v3b) <- c("a", "b", "c")
v3b
```

```
##  a  b  c 
##  5  7 12
```

```r
v3c <- setNames(c(5, 7, 12), c("a", "b", "c"))
v3c
```

```
##  a  b  c 
##  5  7 12
```

--
* Note that `names` is **not** the same thing as `colnames`, but, somewhat confusingly, both work to rename the variables (columns) of a data frame. We'll talk more about why this is momentarily.

---
# Why names might be helpful

```r
v
```

```
##           index value
## the first     1     4
## second        2     5
## III           3     6
## attr(,"matrix_mean")
## [1] 3.5
## attr(,"names")
## [1] "a" "b" "c" "d" "e" "f"
```

.pull-left[

```r
v["b"]
```

```
## b 
## 2
```
]

.pull-right[

```r
v["e"]
```

```
## e 
## 5
```
]

---
# Implementation of factors

```r
fct <- factor(c("a", "a", "b", "c"))
typeof(fct)
```

```
## [1] "integer"
```

```r
attributes(fct)
```

```
## $levels
## [1] "a" "b" "c"
## 
## $class
## [1] "factor"
```

```r
str(fct)
```

```
##  Factor w/ 3 levels "a","b","c": 1 1 2 3
```

---
# More manually

```r
# First create integer vector
int <- c(1L, 1L, 2L, 3L, 1L, 3L)

# assign some levels
attr(int, "levels") <- c("red", "green", "blue")

# change the class to a factor
class(int) <- "factor"

int
```

```
## [1] red   red   green blue  red   blue 
## Levels: red green blue
```

---
# This can make things tricky

```r
age <- factor(sample(c("baby", 1:10), 100, replace = TRUE))
str(age)
```

```
##  Factor w/ 11 levels "1","10","2","3",..: 8 1 9 2 1 1 2 10 9 6 ...
```

```r
age
```

```
##   [1] 7    1    8    10   1    1    10   9   
##   [9] 8    5    4    9    baby baby 4    2   
##  [17] 9    baby 4    2    5    9    10   3   
##  [25] 9    baby 10   10   3    3    3    10  
##  [33] baby 8    baby baby 3    baby baby baby
##  [41] 1    7    2    4    baby 2    4    7   
##  [49] 5    6    9    6    8    8    8    2   
##  [57] 1    4    8    3    8    5    7    baby
##  [65] 5    10   8    5    9    5    5    3   
##  [73] 6    5    9    4    10   4    1    8   
##  [81] 5    baby baby 10   1    6    1    9   
##  [89] 9    3    3    6    baby 10   4    5   
##  [97] 3    1    9    8   
## Levels: 1 10 2 3 4 5 6 7 8 9 baby
```

What if we wanted to convert this to numeric?

---

```r
data.frame(age) %>% 
  count(age) %>% 
  mutate(age_numeric = as.numeric(age)) %>% 
  select(starts_with("age"), n)
```

```
##     age age_numeric  n
## 1     1           1  9
## 2    10           2 10
## 3     2           3  5
## 4     3           4 10
## 5     4           5  9
## 6     5           6 11
## 7     6           7  5
## 8     7           8  4
## 9     8           9 11
## 10    9          10 11
## 11 baby          11 15
```

--
These are the integers associated with the factor levels, so `as.numeric()` will not give us the results we want.

---
# Fix

First convert to character, then to numeric (you can ignore the warning in this case)

```r
data.frame(age) %>% 
  mutate(
    age_chr = as.character(age),
    age_num = ifelse(age_chr == "baby", 0, as.numeric(age_chr))
  ) %>% 
  count(age, age_chr, age_num)
```

```
##     age age_chr age_num  n
## 1     1       1       1  9
## 2    10      10      10 10
## 3     2       2       2  5
## 4     3       3       3 10
## 5     4       4       4  9
## 6     5       5       5 11
## 7     6       6       6  5
## 8     7       7       7  4
## 9     8       8       8 11
## 10    9       9       9 11
## 11 baby    baby       0 15
```

---
# Implementation of dates

```r
date <- Sys.Date()
typeof(date)
```

```
## [1] "double"
```

```r
attributes(date)
```

```
## $class
## [1] "Date"
```

```r
attributes(date) <- NULL
date
```

```
## [1] 19079
```

* This number represents the days passed since January 1, 1970, known as the Unix epoch.

---
# A bit more on classes
Why do these all print different things?

.pull-left[

```r
summary(mtcars[, 1:2])
```

```
##       mpg             cyl       
##  Min.   :10.40   Min.   :4.000  
##  1st Qu.:15.43   1st Qu.:4.000  
##  Median :19.20   Median :6.000  
##  Mean   :20.09   Mean   :6.188  
##  3rd Qu.:22.80   3rd Qu.:8.000  
##  Max.   :33.90   Max.   :8.000
```

```r
summary(gss_cat$marital)
```

```
##     No answer Never married     Separated 
##            17          5416           743 
##      Divorced       Widowed       Married 
##          3383          1807         10117
```
]

.pull-right[

```r
m <- lm(mpg ~ cyl, mtcars)
summary(m)
```

```
## 
## Call:
## lm(formula = mpg ~ cyl, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.9814 -2.1185  0.2217  1.0717  7.5186 
## 
## Coefficients:
##             Estimate Std. Error t value
## (Intercept)  37.8846     2.0738   18.27
## cyl          -2.8758     0.3224   -8.92
##             Pr(>|t|)    
## (Intercept)  < 2e-16 ***
## cyl         6.11e-10 ***
## ---
## Signif. codes:  
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.206 on 30 degrees of freedom
## Multiple R-squared:  0.7262,	Adjusted R-squared:  0.7171 
## F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10
```
]

---
# What are the classes?

```r
class(mtcars[, 1:2])
```

```
## [1] "data.frame"
```

```r
class(gss_cat$marital)
```

```
## [1] "factor"
```

```r
class(m)
```

```
## [1] "lm"
```

---
# S3 methods

When you call `summary()`, it looks for a method (function) for that specific class.

--
`summary(mtcars[, 1:2])` becomes `summary.data.frame(mtcars[, 1:2])`

--
`summary(gss_cat$marital)` becomes `summary.factor(gss_cat$marital)`

--
`summary(m)` becomes `summary.lm(m)`

---
# Naming function

Because S3 is so common in R, I recommend against including dots in function names.

--
`summary.data.frame()` is less clear than it would be if it were `summary.data_frame()`

--
Classes and methods is not something I'm going to expect you to have a deep knowledge on, but I want you to be aware of it.

---
# Missing values

* Missing values breed missing values

```r
NA > 5
```

```
## [1] NA
```

```r
NA * 7
```

```
## [1] NA
```

* What about this one?

```r
NA == NA
```

```
## [1] NA
```

--
It is correct because there's no reason to presume that one missing
value is or is not equal to another missing value.

---
# When missing values don't propagate

```r
NA | TRUE
```

```
## [1] TRUE
```

```r
x <- c(NA, 3, NA, 5)
any(x > 4)
```

```
## [1] TRUE
```

---
# How to test missingness?

* We've already seen the following doesn't work

```r
x == NA
```

```
## [1] NA NA NA NA
```

--
* Instead, use `is.na`

```r
is.na(x)
```

```
## [1]  TRUE FALSE  TRUE FALSE
```

---
class: inverse-blue middle
# Lists

---
# Lists
* Lists are vectors, but not *atomic* vectors

* Fundamental difference - each element can be a different type

```r
list("a", 7L, 3.25, TRUE)
```

```
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] 7
## 
## [[3]]
## [1] 3.25
## 
## [[4]]
## [1] TRUE
```

---
# Lists

.pull-left[

* Technically, each element of the list is a vector, possibly atomic

* The prior example included all *scalars*, which are vectors of length 1.

* Lists do not require all elements to be the same length

]

.pull-right[

```r
l <- list(
  c("a", "b", "c"),
  rnorm(5),
  c(7L, 2L),
  c(TRUE, TRUE, FALSE, TRUE)
)
l
```

```
## [[1]]
## [1] "a" "b" "c"
## 
## [[2]]
## [1]  0.19253010 -0.97561601 -0.06092112
## [4]  1.65972336 -0.07401147
## 
## [[3]]
## [1] 7 2
## 
## [[4]]
## [1]  TRUE  TRUE FALSE  TRUE
```
]

---
# Check the list

```r
typeof(l)
```

```
## [1] "list"
```

```r
attributes(l)
```

```
## NULL
```

```r
str(l)
```

```
## List of 4
##  $ : chr [1:3] "a" "b" "c"
##  $ : num [1:5] 0.1925 -0.9756 -0.0609 1.6597 -0.074
##  $ : int [1:2] 7 2
##  $ : logi [1:4] TRUE TRUE FALSE TRUE
```

---
# Data frames as lists
* A data frame is just a special case of a list, where all the elements are of the same length.

.pull-left[

```r
l_df <- list(
  a = c("red", "blue"),
  b = rnorm(2),
  c = c(7L, 2L),
  d = c(TRUE, FALSE)
)
l_df
```

```
## $a
## [1] "red"  "blue"
## 
## $b
## [1] 0.3921062 2.0518441
## 
## $c
## [1] 7 2
## 
## $d
## [1]  TRUE FALSE
```
]

.pull-right[

```r
data.frame(l_df)
```

```
##      a         b c     d
## 1  red 0.3921062 7  TRUE
## 2 blue 2.0518441 2 FALSE
```
]

---
class: inverse-red middle
# Subsetting Lists

---
# A nested list
Lists are often complicated objects. Let's create a somewhat complicated one

```r
x <- c(a = 3, b = 5, c = 7)
l <- list(
  x = x,
  x2 = c(x, x),
  x3 = list(
    vect = x,
    squared = x^2,
    cubed = x^3)
)
```

---
# Subsetting lists
Multiple methods
* Most common: `$`, `[`, and `[[`

.pull-left[

```r
l[1]
```

```
## $x
## a b c 
## 3 5 7
```

```r
typeof(l[1])
```

```
## [1] "list"
```

```r
l[[1]]
```

```
## a b c 
## 3 5 7
```
]

.pull-right[

```r
typeof(l[[1]])
```

```
## [1] "double"
```

```r
l[[1]]["c"]
```

```
## c 
## 7
```
]

---
# Which bracket to use?

.footnote[From [r4DS](https://r4ds.had.co.nz/vectors.html?q=list#lists-of-condiments)]

![](https://i.stack.imgur.com/6Vwry.png)

---
# Another analogy

.footnote[From [Advanced-R](https://adv-r.hadley.nz/subsetting.html)]

![](https://d33wubrfki0l68.cloudfront.net/1f648d451974f0ed313347b78ba653891cf59b21/8185b/diagrams/subsetting/train.png)

---
.footnote[From [Advanced-R](https://adv-r.hadley.nz/subsetting.html)]

![](https://d33wubrfki0l68.cloudfront.net/aea9600956ff6fbbc29d8bd49124cca46c5cb95c/28eaa/diagrams/subsetting/train-single.png)

---
.footnote[From [Advanced-R](https://adv-r.hadley.nz/subsetting.html)]

![](https://d33wubrfki0l68.cloudfront.net/ef5798a60926462b9fc080afb0145977eca70b83/039f5/diagrams/subsetting/train-multiple.png)
---
# Named list
* Because the elements of the list are named, we can also use `$`, just like with a data frame (which is a list)

```r
l$x2
```

```
## a b c a b c 
## 3 5 7 3 5 7
```

```r
l$x3
```

```
## $vect
## a b c 
## 3 5 7 
## 
## $squared
##  a  b  c 
##  9 25 49 
## 
## $cubed
##   a   b   c 
##  27 125 343
```

---
# Subsetting nested lists
* Multiple `$` if all named

```r
l$x3$squared
```

```
##  a  b  c 
##  9 25 49
```

* Note this doesn't work on named elements of an atomic vector, just the named elements of a list

```r
l$x3$squared$b
```

```
## Error in l$x3$squared$b: $ operator is invalid for atomic vectors
```

---
But we could do something like...

```r
l$x3$squared["b"]
```

```
##  b 
## 25
```

---
# Alternatives

* You can always use logical

* Indexing works too

.pull-left[

```r
l[c(TRUE, FALSE, TRUE)]
```

```
## $x
## a b c 
## 3 5 7 
## 
## $x3
## $x3$vect
## a b c 
## 3 5 7 
## 
## $x3$squared
##  a  b  c 
##  9 25 49 
## 
## $x3$cubed
##   a   b   c 
##  27 125 343
```
]

.pull-right[

```r
l[c(1, 3)]
```

```
## $x
## a b c 
## 3 5 7 
## 
## $x3
## $x3$vect
## a b c 
## 3 5 7 
## 
## $x3$squared
##  a  b  c 
##  9 25 49 
## 
## $x3$cubed
##   a   b   c 
##  27 125 343
```
]

---
# Careful with your brackets

```r
l[[c(TRUE, FALSE, FALSE)]]
```

```
## Error in l[[c(TRUE, FALSE, FALSE)]]: recursive indexing failed at level 2
```

* Why doesn't the above work?

---
## Subsetting in multiple dimensions

* Generally we deal with 2d data frames

* If there are two dimensions, we separate the `[` subsetting with a comma

```r
head(mtcars)
```

```
##                    mpg cyl disp  hp drat
## Mazda RX4         21.0   6  160 110 3.90
## Mazda RX4 Wag     21.0   6  160 110 3.90
## Datsun 710        22.8   4  108  93 3.85
## Hornet 4 Drive    21.4   6  258 110 3.08
## Hornet Sportabout 18.7   8  360 175 3.15
## Valiant           18.1   6  225 105 2.76
##                      wt  qsec vs am gear carb
## Mazda RX4         2.620 16.46  0  1    4    4
## Mazda RX4 Wag     2.875 17.02  0  1    4    4
## Datsun 710        2.320 18.61  1  1    4    1
## Hornet 4 Drive    3.215 19.44  1  0    3    1
## Hornet Sportabout 3.440 17.02  0  0    3    2
## Valiant           3.460 20.22  1  0    3    1
```

```r
mtcars[3, 4]
```

```
## [1] 93
```

---
# Empty indicators
* An empty indicator implies "all"

--
### Select the entire fourth column

```r
mtcars[ ,4]
```

```
##  [1] 110 110  93 110 175 105 245  62  95 123
## [11] 123 180 180 180 205 215 230  66  52  65
## [21]  97 150 150 245 175  66  91 113 264 175
## [31] 335 109
```

--
### Select the entire 4th row

```r
mtcars[4, ]
```

```
##                 mpg cyl disp  hp drat    wt
## Hornet 4 Drive 21.4   6  258 110 3.08 3.215
##                 qsec vs am gear carb
## Hornet 4 Drive 19.44  1  0    3    1
```

---
# Data types returned

* By default, each of the prior will return a vector, which itself can be subset

The following are equivalent

```r
mtcars[4, c("mpg", "hp")]
```

```
##                 mpg  hp
## Hornet 4 Drive 21.4 110
```

```r
mtcars[4, ][c("mpg", "hp")]
```

```
##                 mpg  hp
## Hornet 4 Drive 21.4 110
```

---
# Return a data frame
* Often, you don't want the vector returned, but rather the modified data frame.

* Specify `drop = FALSE`

```r
mtcars[ ,4]
```

```
##  [1] 110 110  93 110 175 105 245  62  95 123
## [11] 123 180 180 180 205 215 230  66  52  65
## [21]  97 150 150 245 175  66  91 113 264 175
## [31] 335 109
```

```r
mtcars[ ,4, drop = FALSE]
```

```
##                      hp
## Mazda RX4           110
## Mazda RX4 Wag       110
## Datsun 710           93
## Hornet 4 Drive      110
## Hornet Sportabout   175
## Valiant             105
## Duster 360          245
## Merc 240D            62
## Merc 230             95
## Merc 280            123
## Merc 280C           123
## Merc 450SE          180
## Merc 450SL          180
## Merc 450SLC         180
## Cadillac Fleetwood  205
## Lincoln Continental 215
## Chrysler Imperial   230
## Fiat 128             66
## Honda Civic          52
## Toyota Corolla       65
## Toyota Corona        97
## Dodge Challenger    150
## AMC Javelin         150
## Camaro Z28          245
## Pontiac Firebird    175
## Fiat X1-9            66
## Porsche 914-2        91
## Lotus Europa        113
## Ford Pantera L      264
## Ferrari Dino        175
## Maserati Bora       335
## Volvo 142E          109
```

---
# tibbles
* Note dropping the data frame attribute is the default for a `data.frame` but .b[.bolder[NOT]] a `tibble`.

```r
mtcars_tbl <- tibble::as_tibble(mtcars)
mtcars_tbl[ ,4]
```

```
## # A tibble: 32 × 1
##       hp
##    <dbl>
##  1   110
##  2   110
##  3    93
##  4   110
##  5   175
##  6   105
##  7   245
##  8    62
##  9    95
## 10   123
## # … with 22 more rows
```

---
# You can override this

```r
mtcars_tbl[ ,4, drop = TRUE]
```

```
##  [1] 110 110  93 110 175 105 245  62  95 123
## [11] 123 180 180 180 205 215 230  66  52  65
## [21]  97 150 150 245 175  66  91 113 264 175
## [31] 335 109
```

---
# More than two dimensions
* Depending on your applications, you may not run into this much

```r
array <- 1:12
dim(array) <- c(2, 3, 2)
array
```

```
## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]    7    9   11
## [2,]    8   10   12
```

---
# Subset array
### Select just the second matrix

```r
array[ , ,2]
```

```
##      [,1] [,2] [,3]
## [1,]    7    9   11
## [2,]    8   10   12
```

### Select first column of each matrix

```r
array[ ,1, ]
```

```
##      [,1] [,2]
## [1,]    1    7
## [2,]    2    8
```

---
# Back to lists
### Why are they so useful?
* Much more flexible

* Often returned by functions, for example, `lm`

```r
m <- lm(mpg ~ hp, mtcars)
str(m)
```

```
## List of 12
##  $ coefficients : Named num [1:2] 30.0989 -0.0682
##   ..- attr(*, "names")= chr [1:2] "(Intercept)" "hp"
##  $ residuals    : Named num [1:32] -1.594 -1.594 -0.954 -1.194 0.541 ...
##   ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
##  $ effects      : Named num [1:32] -113.65 -26.046 -0.556 -0.852 0.67 ...
##   ..- attr(*, "names")= chr [1:32] "(Intercept)" "hp" "" "" ...
##  $ rank         : int 2
##  $ fitted.values: Named num [1:32] 22.6 22.6 23.8 22.6 18.2 ...
##   ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
##  $ assign       : int [1:2] 0 1
##  $ qr           :List of 5
##   ..$ qr   : num [1:32, 1:2] -5.657 0.177 0.177 0.177 0.177 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
##   .. .. ..$ : chr [1:2] "(Intercept)" "hp"
##   .. ..- attr(*, "assign")= int [1:2] 0 1
##   ..$ qraux: num [1:2] 1.18 1.08
##   ..$ pivot: int [1:2] 1 2
##   ..$ tol  : num 1e-07
##   ..$ rank : int 2
##   ..- attr(*, "class")= chr "qr"
##  $ df.residual  : int 30
##  $ xlevels      : Named list()
##  $ call         : language lm(formula = mpg ~ hp, data = mtcars)
##  $ terms        :Classes 'terms', 'formula'  language mpg ~ hp
##   .. ..- attr(*, "variables")= language list(mpg, hp)
##   .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. ..$ : chr [1:2] "mpg" "hp"
##   .. .. .. ..$ : chr "hp"
##   .. ..- attr(*, "term.labels")= chr "hp"
##   .. ..- attr(*, "order")= int 1
##   .. ..- attr(*, "intercept")= int 1
##   .. ..- attr(*, "response")= int 1
##   .. ..- attr(*, ".Environment")=<environment: 0x7f9f6cc12a28> 
##   .. ..- attr(*, "predvars")= language list(mpg, hp)
##   .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. ..- attr(*, "names")= chr [1:2] "mpg" "hp"
##  $ model        :'data.frame':	32 obs. of  2 variables:
##   ..$ mpg: num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##   ..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
##   ..- attr(*, "terms")=Classes 'terms', 'formula'  language mpg ~ hp
##   .. .. ..- attr(*, "variables")= language list(mpg, hp)
##   .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. ..$ : chr [1:2] "mpg" "hp"
##   .. .. .. .. ..$ : chr "hp"
##   .. .. ..- attr(*, "term.labels")= chr "hp"
##   .. .. ..- attr(*, "order")= int 1
##   .. .. ..- attr(*, "intercept")= int 1
##   .. .. ..- attr(*, "response")= int 1
##   .. .. ..- attr(*, ".Environment")=<environment: 0x7f9f6cc12a28> 
##   .. .. ..- attr(*, "predvars")= language list(mpg, hp)
##   .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. .. ..- attr(*, "names")= chr [1:2] "mpg" "hp"
##  - attr(*, "class")= chr "lm"
```

---
# Summary
* Atomic vectors must all be the same type

+ implicit coercion occurs if not (and you haven't specified the coercion
  explicitly)

* Lists are also vectors, but not atomic vectors
  
  + Each element can be of a different type and length

+ Incredibly flexible, but often a little more difficult to get the hang of, particularly with subsetting

---
class: inverse-green middle
# Next time
### Loops with base R