Introduction

For this module we will be using R and Rstudio. The lesson assumes no prior knowledge of R, but some experience may be helpful to some learners. At the end of the lesson, learners will be able to:

-Open and type commands into R and/or Rstudio, and have them executed in the console

-Know how to get help with R errors or warnings

-Have an understanding that there are user/community contributions called ‘packages’ that can help with specific areas

-Calculate the phenotypic covariance matrix

-Compute the additive co-ancestry matrix a set of markers

-Impute missing marker data

-Estimate population structure using a PCA

-Run a GWAS for a trait of interest

Setup (to be done prior to module)

For this module you will need to download base R and rstudio (if you don’t already have them). For this lesson we will be using R 3.4.3 “Kite-Eating Tree”. There are distributions of base R for all operating major operating systems.

Download R first (to be done prior to module)

-MAC OS users go here: https://cran.r-project.org/bin/macosx/R-3.4.3.pkg

-Windows users go here: https://cran.r-project.org/bin/windows/base/

Download RStudio (to be done prior to module)

Rstudio is an IDE (integrated development environment) for R, and it streamlines writing of scripts, project management, and plotting - click the link below and choose “Rstudio desktop” (free version) and choose your OS version:

https://www.rstudio.com/products/rstudio/download/#download

Download Data Set (to be done prior to module)

For this module we will be using a Picea glauca (white spruce) dataset of 15 phenotypes, 1694 individual trees, and some 6000 SNP markers. The markers have no linkage map, nor does the data set include pedigree based information. Picea glauca like most conifers, has a very large estimated genome size of 16.15 pg or 2100cM (estimated genetic map size), and 12 autosomal chromosomes. This means that we have very low marker coverage - this is further detailed in the paper (Beaulieu et al. 2014). White spruce is widespread in the Boreal and used mostly in residential home construction in Canada and paper making.

We will use a slightly altered version of the original dataset for the sake of time; however, the original dataset and associated paper by Beaulieu et al. (2014) lives here if anyone is interested in taking a look.

http://datadryad.org/resource/doi:10.5061/dryad.6rd6f

Part 1: Getting used to and oriented to R

Open up RStudio

Since this is the first time we’ve opened R let’s list the objects in our workspace:

ls()
## character(0)

There should be no objects in your workspace. If there are you can wipe the workspace in rstudio using the tiny broom icon in the environment panel. Or, you can simply type this command (NOTE only use this command if you are certain you want to wipe all objects from the workspace).

rm(list=ls())

ls()
## character(0)

You can type these directly in the console, or preferrably in our script file. If we type commands in our console, they will be more difficult to recall or remember.

Load the phenotype data and understand its structure

Now that are workspace is a blank space we can start working with our spruce dataset. First we will load the phenotypes into the workspace:

phenos <- read.table("S1.txt", header = TRUE, sep="\t")

If you’re new to R, you’ll notice that when you executed this command on the console - nothing happened. That’s generally good news, as it means your command was successfully executed. You’ll notice that if you type ls() an object named "phenos".

The first thing to do before examining any data is examine it and visualize it. R has several useful commands to do this. First, let’s figure out the size of our data.

dim(phenos)
## [1] 1694   15

The output of this command tells us that we have an array of 1694 rows x 15 columns. The rows always come first, columns second in R. That’s great; however, we would like to get a better picture of the data. We’ll do that with the command str() or the structure command.

str(phenos)
## 'data.frame':    1694 obs. of  15 variables:
##  $ Tree    : Factor w/ 1694 levels "E560A3_1_001_5",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ q1      : num  2.854 1.015 2.006 2.023 0.764 ...
##  $ q2      : num  1.932 0.057 -0.972 -2.276 -2.041 ...
##  $ cellpop : num  1497 1278 1144 1536 1446 ...
##  $ coars   : num  274 327 356 297 292 ...
##  $ cryst   : num  2.64 2.58 2.46 2.72 2.54 ...
##  $ dens    : num  411 413 406 456 419 ...
##  $ mfa     : num  21.6 19.9 25.9 14.8 25.5 ...
##  $ moe     : num  8.9 9.01 6.64 13.54 6.89 ...
##  $ rad     : num  27.9 28.8 30.8 26.9 28.2 ...
##  $ rwidth  : num  3.51 2.87 3.7 2.58 2.82 ...
##  $ specsurf: num  403 363 353 367 379 ...
##  $ tad     : num  25.1 28.4 29.7 25.4 25.5 ...
##  $ cellwt  : num  1.91 2.09 2.16 2.12 1.99 ...
##  $ h97     : int  480 575 725 520 460 400 495 540 370 405 ...

This not only gives us the dimensions of the data, the datatype we are dealing with, the column names, and the column classes. What does str() mean though? We can ask R. Type the command below for any function and you’ll get the help file to pop up with any arguments that might be contained within that fuction.

?str()

There’s also another function summary() which produces a brief summary of all data within our data.frame() columns.

summary(phenos)
##              Tree            q1                 q2           
##  E560A3_1_001_5:   1   Min.   :-36.8760   Min.   :-5.237000  
##  E560A3_1_002_1:   1   1st Qu.: -0.9895   1st Qu.:-1.624250  
##  E560A3_1_003_4:   1   Median :  0.4155   Median :-0.692000  
##  E560A3_1_004_2:   1   Mean   :  0.0000   Mean   : 0.000006  
##  E560A3_1_004_3:   1   3rd Qu.:  1.7158   3rd Qu.: 0.405000  
##  E560A3_1_005_4:   1   Max.   : 13.0050   Max.   :26.453000  
##  (Other)       :1688                                         
##     cellpop           coars           cryst            dens      
##  Min.   : 972.5   Min.   :196.8   Min.   :2.409   Min.   :341.3  
##  1st Qu.:1265.0   1st Qu.:284.1   1st Qu.:2.583   1st Qu.:394.6  
##  Median :1361.0   Median :305.7   Median :2.614   Median :416.3  
##  Mean   :1379.7   Mean   :307.7   Mean   :2.616   Mean   :418.8  
##  3rd Qu.:1478.3   3rd Qu.:331.4   3rd Qu.:2.651   3rd Qu.:438.6  
##  Max.   :2037.2   Max.   :511.7   Max.   :2.829   Max.   :662.3  
##                                                                  
##       mfa              moe              rad            rwidth     
##  Min.   : 9.982   Min.   : 4.667   Min.   :21.73   Min.   :1.266  
##  1st Qu.:15.859   1st Qu.: 8.579   1st Qu.:27.29   1st Qu.:2.649  
##  Median :18.479   Median : 9.915   Median :28.65   Median :3.098  
##  Mean   :18.970   Mean   : 9.933   Mean   :28.61   Mean   :3.116  
##  3rd Qu.:21.624   3rd Qu.:11.182   3rd Qu.:29.83   3rd Qu.:3.551  
##  Max.   :34.206   Max.   :17.159   Max.   :35.05   Max.   :5.350  
##                                                                   
##     specsurf          tad            cellwt           h97       
##  Min.   :248.9   Min.   :20.82   Min.   :1.516   Min.   :210.0  
##  1st Qu.:357.8   1st Qu.:25.64   1st Qu.:1.918   1st Qu.:420.0  
##  Median :376.3   Median :26.74   Median :2.034   Median :500.0  
##  Mean   :377.8   Mean   :26.71   Mean   :2.044   Mean   :499.6  
##  3rd Qu.:397.4   3rd Qu.:27.73   3rd Qu.:2.156   3rd Qu.:575.0  
##  Max.   :497.5   Max.   :33.42   Max.   :3.255   Max.   :890.0  
## 

The summary() can be very useful for figuring out if there’s a difference between mean and median values for example.

Phenotypic correlations and phenotypic covariance matrix

Since phenotypic correlations are important to consider in multivariate breeders’ selection we can produce this matrix in R.

cor(phenos)

An error is generated Error in cor(phenos) : x must be numeric. If you remember from the str and summary commands, one of the columns is not numeric.

If we re-execute our command, this time without that column, it should work. We can do this with indexing.

cor(phenos[,-1])
##                     q1            q2    cellpop        coars       cryst
## q1        1.000000e+00  6.636573e-07  0.1126717  0.005282580  0.08424207
## q2        6.636573e-07  1.000000e+00  0.1327060 -0.003165443  0.01780984
## cellpop   1.126717e-01  1.327060e-01  1.0000000 -0.734802338  0.28978906
## coars     5.282580e-03 -3.165443e-03 -0.7348023  1.000000000 -0.24610054
## cryst     8.424207e-02  1.780984e-02  0.2897891 -0.246100539  1.00000000
## dens      1.553326e-01  1.800205e-01  0.3622618  0.348782785  0.04544474
## mfa      -5.583501e-03  3.200013e-02  0.1171037 -0.210611746 -0.42482115
## moe       8.928539e-02  5.144695e-02  0.1484318  0.234571698  0.46750141
## rad      -1.469887e-01 -1.259350e-01 -0.8916835  0.599320506 -0.24898820
## rwidth   -1.103279e-01 -1.783114e-01 -0.4775837  0.121632541 -0.14949548
## specsurf -8.920521e-02 -9.239948e-02  0.3449667 -0.867492315  0.15353612
## tad      -3.038410e-02 -8.775873e-02 -0.8758502  0.731012677 -0.24498786
## cellwt    8.732436e-02  9.601474e-02 -0.2741314  0.845108059 -0.13653765
## h97      -5.385958e-02 -3.510943e-02 -0.4832443  0.337976722 -0.10478333
##                 dens          mfa         moe         rad     rwidth
## q1        0.15533261 -0.005583501  0.08928539 -0.14698873 -0.1103279
## q2        0.18002049  0.032000128  0.05144695 -0.12593498 -0.1783114
## cellpop   0.36226185  0.117103666  0.14843176 -0.89168354 -0.4775837
## coars     0.34878278 -0.210611746  0.23457170  0.59932051  0.1216325
## cryst     0.04544474 -0.424821149  0.46750141 -0.24898820 -0.1494955
## dens      1.00000000 -0.138458829  0.54317426 -0.42162981 -0.5074612
## mfa      -0.13845883  1.000000000 -0.86515231 -0.05660189  0.2583345
## moe       0.54317426 -0.865152312  1.00000000 -0.20416534 -0.4571478
## rad      -0.42162981 -0.056601886 -0.20416534  1.00000000  0.5370733
## rwidth   -0.50746124  0.258334485 -0.45714784  0.53707331  1.0000000
## specsurf -0.72874941  0.251858441 -0.45715612 -0.21491294  0.1667078
## tad      -0.21723958 -0.150876729 -0.05722434  0.59396771  0.3320997
## cellwt    0.79392125 -0.213577129  0.45865483  0.14486824 -0.2106084
## h97      -0.20912127  0.063519823 -0.18034881  0.46891413  0.6033065
##             specsurf         tad      cellwt         h97
## q1       -0.08920521 -0.03038410  0.08732436 -0.05385958
## q2       -0.09239948 -0.08775873  0.09601474 -0.03510943
## cellpop   0.34496672 -0.87585024 -0.27413144 -0.48324426
## coars    -0.86749232  0.73101268  0.84510806  0.33797672
## cryst     0.15353612 -0.24498786 -0.13653765 -0.10478333
## dens     -0.72874941 -0.21723958  0.79392125 -0.20912127
## mfa       0.25185844 -0.15087673 -0.21357713  0.06351982
## moe      -0.45715612 -0.05722434  0.45865483 -0.18034881
## rad      -0.21491294  0.59396771  0.14486824  0.46891413
## rwidth    0.16670782  0.33209965 -0.21060843  0.60330651
## specsurf  1.00000000 -0.40721253 -0.97163670 -0.12919524
## tad      -0.40721253  1.00000000  0.35297580  0.42334539
## cellwt   -0.97163670  0.35297580  1.00000000  0.09946855
## h97      -0.12919524  0.42334539  0.09946855  1.00000000

We can exclude multiple columns by using the combine function c()

cor(phenos[,-c(1:3)])
##             cellpop      coars       cryst        dens         mfa
## cellpop   1.0000000 -0.7348023  0.28978906  0.36226185  0.11710367
## coars    -0.7348023  1.0000000 -0.24610054  0.34878278 -0.21061175
## cryst     0.2897891 -0.2461005  1.00000000  0.04544474 -0.42482115
## dens      0.3622618  0.3487828  0.04544474  1.00000000 -0.13845883
## mfa       0.1171037 -0.2106117 -0.42482115 -0.13845883  1.00000000
## moe       0.1484318  0.2345717  0.46750141  0.54317426 -0.86515231
## rad      -0.8916835  0.5993205 -0.24898820 -0.42162981 -0.05660189
## rwidth   -0.4775837  0.1216325 -0.14949548 -0.50746124  0.25833449
## specsurf  0.3449667 -0.8674923  0.15353612 -0.72874941  0.25185844
## tad      -0.8758502  0.7310127 -0.24498786 -0.21723958 -0.15087673
## cellwt   -0.2741314  0.8451081 -0.13653765  0.79392125 -0.21357713
## h97      -0.4832443  0.3379767 -0.10478333 -0.20912127  0.06351982
##                  moe         rad     rwidth   specsurf         tad
## cellpop   0.14843176 -0.89168354 -0.4775837  0.3449667 -0.87585024
## coars     0.23457170  0.59932051  0.1216325 -0.8674923  0.73101268
## cryst     0.46750141 -0.24898820 -0.1494955  0.1535361 -0.24498786
## dens      0.54317426 -0.42162981 -0.5074612 -0.7287494 -0.21723958
## mfa      -0.86515231 -0.05660189  0.2583345  0.2518584 -0.15087673
## moe       1.00000000 -0.20416534 -0.4571478 -0.4571561 -0.05722434
## rad      -0.20416534  1.00000000  0.5370733 -0.2149129  0.59396771
## rwidth   -0.45714784  0.53707331  1.0000000  0.1667078  0.33209965
## specsurf -0.45715612 -0.21491294  0.1667078  1.0000000 -0.40721253
## tad      -0.05722434  0.59396771  0.3320997 -0.4072125  1.00000000
## cellwt    0.45865483  0.14486824 -0.2106084 -0.9716367  0.35297580
## h97      -0.18034881  0.46891413  0.6033065 -0.1291952  0.42334539
##               cellwt         h97
## cellpop  -0.27413144 -0.48324426
## coars     0.84510806  0.33797672
## cryst    -0.13653765 -0.10478333
## dens      0.79392125 -0.20912127
## mfa      -0.21357713  0.06351982
## moe       0.45865483 -0.18034881
## rad       0.14486824  0.46891413
## rwidth   -0.21060843  0.60330651
## specsurf -0.97163670 -0.12919524
## tad       0.35297580  0.42334539
## cellwt    1.00000000  0.09946855
## h97       0.09946855  1.00000000
cor(phenos[,-c(1,4:9,10)])
##                     q1            q2     rwidth    specsurf         tad
## q1        1.000000e+00  6.636573e-07 -0.1103279 -0.08920521 -0.03038410
## q2        6.636573e-07  1.000000e+00 -0.1783114 -0.09239948 -0.08775873
## rwidth   -1.103279e-01 -1.783114e-01  1.0000000  0.16670782  0.33209965
## specsurf -8.920521e-02 -9.239948e-02  0.1667078  1.00000000 -0.40721253
## tad      -3.038410e-02 -8.775873e-02  0.3320997 -0.40721253  1.00000000
## cellwt    8.732436e-02  9.601474e-02 -0.2106084 -0.97163670  0.35297580
## h97      -5.385958e-02 -3.510943e-02  0.6033065 -0.12919524  0.42334539
##               cellwt         h97
## q1        0.08732436 -0.05385958
## q2        0.09601474 -0.03510943
## rwidth   -0.21060843  0.60330651
## specsurf -0.97163670 -0.12919524
## tad       0.35297580  0.42334539
## cellwt    1.00000000  0.09946855
## h97       0.09946855  1.00000000

Let’s generate the phenotypic covariance matrix, that we could actually use in a breeding program.

pheno_covar <- cov(phenos[,-c(1:3)])
pheno_cor <- cor(phenos[,-c(1:3)])

Visualize the data

R comes with numerous helpful commands for plotting and visualizing data. For our phenotypic dataset, let’s use a couple basic plots to get a sense of our trait data

hist(phenos$cryst)

plot(phenos$coars ~ phenos$cryst)

pairs(pheno_cor)

Using packages

One of the other positives with using R is that its open-source community contributes many pacakges. A package is a set of functions contributed by a/many users for a specific area of study. Packages may have dependencies on other packages. R has a very active community, and there are many thousands of packages that users can take advantage of, the field of genetics is no exception.

For this lesson, we will use a couple of pkgs rrBLUP from Jeff Endelman’s lab at the University of Wisconsin Madison. We will install the package and load it into the R workspace with the function:

install.packages("rrBLUP")
library(rrBLUP)

If we ever want to read more about a package we can use the citation function, that is: citation(rrBLUP) to read more about who generated the package or help(), that is: help(rrBLUP).

citation("rrBLUP")
## 
## To cite the rrBLUP package in publications use:
## 
##   Endelman, J.B. 2011. Ridge regression and other kernels for
##   genomic selection with R package rrBLUP. Plant Genome 4:250-255.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     title = {Ridge regression and other kernels for genomic selection with R package rrBLUP},
##     author = {J. B. Endelman},
##     journal = {Plant Genome},
##     year = {2011},
##     volume = {4},
##     pages = {250-255},
##   }
help(rrBLUP)
## No documentation for 'rrBLUP' in specified packages and libraries:
## you could try '??rrBLUP'

Part 2: Quantitative genetics in R

Load the genotype data

R can read in different data types, and R packages expand that capability. Above, we read in a tab-delimited text file and it’s not uncommon to read in csv (comma separated value files). There are even numerous packages that allow you to read in native ‘.xls’ and/or ‘.xlsx’ files.

We will load an ‘.RData’ file, which has been pre-processed in advance, which contains a single object genos, however note that RData files can contain multiple objects if you choose. We’ll also load the linkage map (this is a fake made up linkage map, with a 13th chromosome for one of the organelles or unknown SNPs).

load("genos.RData")
dim(genos.rrBLUP)
## [1] 1694 6716
marker.map <- read.csv("map.csv", header = TRUE)
marker.map <- marker.map[,-1]

What happens if we use the functions above to get an idea of the data structure? Type them in and see what happens.

Could we use our indexing technique from the phenotypic correlations to get an idea?

On your own see if you can print a small subset of the object to get an idea of its structure.

Compute the additive matrix and impute markers

We have no information on pedigree with this dataset, and as such we really can’t differentiate between IBD (identity-by-descent) and IBS (identity-by-state). The more SNPs, the more likely IBS starts to approximates IBD. Nonetheless, we’ll proceed with this toy example. We’ll use the function A.mat from the rrBLUP package we previously loaded. Let’s first look at what this function does by typing its name in the console, and we can also find out more by doing what?

library(rrBLUP)
A.mat.spruce <- A.mat(genos.rrBLUP, min.MAF=0.05,
                      impute.method="mean", 
                      return.imputed=TRUE)
dim(A.mat.spruce)
## NULL

You’ll notice here that I asked also for the option to return an imputed matrix based on the mean of each marker column. This is for downstream analyses, as many GS (genomic selection) and GWAS canned packages/functions require no missing data. I also chose the method that just happens to be the quickest for imputation - NOT THE MOST ACCURATE.

What type of object is retured? It’s a list, which is a special data type that allows the storing of different object types, of different lengths.

We can unlist() both of those objects for different purposes. For the next step we’ll want to use the imputed marker matrix.

marker_imputed <- unlist(A.mat.spruce$imputed)

Get an estimate of population structure

We don’t know much about this dataset, save that it’s a bunch of open pollinated individual trees in Canada, whose parents originated from three different latitudes/altitudes in the Province of Quebec and are included in the data set.

While kinship is probably the most important type of structure to consider it’s useful to get an estimate of population structure nonetheless. We will use principal component analysis (PCA), not to be confused with principal co-ordinate analysis (PCoA). Here we are using one of the base functions for principal co-ordinate analysis, but for larger dataset there are other options: EIGENSTRAT, flash-pca, SNPRelate (another R package), and many more.

We will use another nice plotting library to look at our PCs - ggplot2. What do you notice about the plot that strikes you given the limited information you have on the lines?

pca.spruce <- prcomp(marker_imputed)
str(pca.spruce)

library(ggplot2)
PCi<-data.frame(pca.spruce$x)
ggplot(PCi,aes(x=PC1,y=PC2)) +
   geom_point(size=3,alpha=0.5)

ggplot(PCi,aes(x=PC1,y=PC2)) +
   geom_point(size=3,alpha=0.5, color="orange") + theme_classic()

Perform a GWAS

Even though we have a pretty good estimate of coancestry and population structure, and it’s normally a good idea to include these in GWAS, here we will perform a GWAS with kinship included as the additive co-ancestry GWAS due to the local computation limits of this exercise. If you would like, you can try including the coancestry matrix (I have) and the PCs in your own GWAS. We’ll choose the tree height from 1997, at 22 years of age, and include the tree id in a data.frame.

?GWAS
pheno <- read.table("S1.txt", header=TRUE)
h97 <- data.frame(pheno$Tree, pheno$h97)
markerT <- data.frame(t(marker_imputed))
colnames(markerT) <- h97$pheno.Tree
genogwas <- marker.map[marker.map$Marker_Name %in% rownames(markerT),]
genogwas <- data.frame(genogwas, markerT)

Ok, now let’s run our GWAS, we’ll remove the default min.MAF = 0.05, since we already did this when we calculated the additive matrix.

GWAS(pheno = h97, geno = genogwas, 
                   min.MAF = 0.05, 
                   K = A.mat.spruce$A,
                   n.core = 4,
                   P3D = TRUE,
                   plot = TRUE)

Lesson “Easter Eggs” (not actually covered in the lesson)

Sometimes it’s useful to gather scripts that you can use or modify for the task at hand. A few of these are provided below, but we won’t cover them today. You may want to try it out at a later date.

Recode nucleotide data to a numeric format

Recoding SNP nuleotide data is one of the things that is a common requisite for many packages for quantitative and population genetics. E.g. taking “AA”, “CC”, “AC”, and translating it to 2, 1, 0 or 1,0,-1. There are other formats too. The function below can do this quickly (try it out later if you’d like).

The original spruce dataset from Data Dryad is loaded into the workspace, we define our function, and then we apply it to the genos.

genos <- read.table("S2.txt", header = TRUE, sep="\t", stringsAsFactors = TRUE)

genos[1:5,1:5]
##        X.Marker. PGAS1_00001 PGAS1_00002 PGAS1_00003 PGAS1_00004
## 1 E560A3_1_001_5         G:G         A:A         A:A         C:C
## 2 E560A3_1_002_1         G:G         A:A         G:G         C:C
## 3 E560A3_1_003_4         G:G         A:A         A:A         C:C
## 4 E560A3_1_004_2         G:G         A:A         G:G         C:G
## 5 E560A3_1_004_3         G:G         A:A         A:A         C:C
recode.SNP.rrBLUP <- function(x) {
  alleles <- unique(x)
  y <- rep(0,length(x))
  y[which(x==alleles[1])] <- -1
  y[which(x==alleles[2])] <- 0
  y[which(x==alleles[3])] <- 1
  y[which(x=="N")] <- NA
  return(y)
}


X <- apply(genos[,-1],2,recode.SNP.rrBLUP)

X[1:5,1:5]
##      PGAS1_00001 PGAS1_00002 PGAS1_00003 PGAS1_00004 PGAS1_00005
## [1,]          -1          -1          -1          -1          -1
## [2,]          -1          -1           0          -1          -1
## [3,]          -1          -1          -1          -1          -1
## [4,]          -1          -1           0           0          -1
## [5,]          -1          -1          -1          -1          -1

If you want the 0, 1, 2 format that is often the required by other programs, you can simply (this is required for SNPRelate and PLINK uses it too):

Xadd <- X + 1

Xadd[1:5,1:5]
##      PGAS1_00001 PGAS1_00002 PGAS1_00003 PGAS1_00004 PGAS1_00005
## [1,]           0           0           0           0           0
## [2,]           0           0           1           0           0
## [3,]           0           0           0           0           0
## [4,]           0           0           1           1           0
## [5,]           0           0           0           0           0

Heritability and response to selection

The authors provide estimates of narrow-sense heritability in their paper in Table 1 (Beaulieu et al. 2014), but don’t provide details on location, environment, or pedigree. As such, we can’t calculate heritability using a mixed model or linear model approach. Fortunately, we can estimate the heritability of single traits based on markers alone, by making use of the additive co-ancestry matrix that we’ve calculated already in the lesson above.

library(heritability)
A <- A.mat.spruce$A
colnames(A) <- pheno$Tree
rownames(A) <- pheno$Tree

h2_h97 <- marker_h2(pheno$h97, pheno$Tree, K=A)

h2_h97$h2

Since we have the narrow sense heritability (h2), and we have the phenotypic variances (Vp), we can easily calculate the additive variance (Va) of each of the trait, by re-arranging the equation h2 = Va/Vp to Va= h2 * Vp. Note that the function marker_h2 calculates both as well, we are merely going through the motions here to understand Gmatrix construction. We can also use the h2 in assessing the response to various selection coefficients.

va_h97 <- h2_h97$h2 * var(pheno$h97)

S = 10
R = h2_h97$h2 * S
R