---
title: "MAGPIE Class Tutorial"
author: "Jan Philipp Dietrich"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{MAGPIE Class Tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

This tutorial provides a basic introduction to magpie objects in R. For more details on the concept behind magpie objects, see [magclass-concept](magclass-concept.html), which explains it by comparing the magpie class to other data classes in R.

## Generate a magpie object

Magpie objects can be generated from scratch, by reading in files, or by converting objects of other classes. To create a new magpie object use `new.magpie`; to convert an existing object use `as.magpie`:

```{r, echo = TRUE}
library(magclass)

# creating a magpie object with 2 regions, 2 years and 2 different values
m <- new.magpie(cells_and_regions = c("AFR", "CPA"),
  years = c(1995, 2000),
  names = c("bla", "blub"),
  sets = c("region", "year", "value"),
  fill = 0)
print(m)

# converting a simple vector with one value per region to a magpie object
v <- c(ENG = 10, USA = 20, BRA = 30, CHN = 40, IND = 50)
m2 <- as.magpie(v)
str(m2)
```
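
As a small follow-up sketch, a freshly created object can be filled cell by cell and inspected with base R functions. The name `exNew` is only illustrative and not part of the objects used in the rest of this tutorial; the year is addressed in its `"y1995"` form here on the assumption that this is how `new.magpie` stores numeric years.

```{r, echo = TRUE}
library(magclass)

# exNew is an illustrative object, independent of the tutorial's own variables
exNew <- new.magpie(cells_and_regions = c("AFR", "CPA"),
  years = c(1995, 2000),
  names = c("bla", "blub"),
  fill = 0)

# overwrite a single cell, addressed by region, year and data name
exNew["AFR", "y1995", "bla"] <- 1

is.magpie(exNew)
dim(exNew)
exNew["AFR", , ]
```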


In the example above the names were automatically detected as regions. If the automatic detection fails, you can indicate explicitly which type of dimension the data belongs to. Let's assume in the following example that the names do not represent regions but something else. The argument `spatial = 0` indicates that there is no spatial dimension; the same argument can also be used to point to the spatial dimension in the data.

```{r, echo = TRUE}
m3 <- as.magpie(v, spatial = 0)
str(m3)
```
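
To make the difference visible, the following sketch converts the same vector both ways and checks where the names end up. The object names `exVec`, `exRegional` and `exData` are illustrative only.

```{r, echo = TRUE}
library(magclass)

# same kind of named vector as above, stored under an illustrative name
exVec <- c(ENG = 10, USA = 20, BRA = 30, CHN = 40, IND = 50)
exRegional <- as.magpie(exVec)             # automatic detection
exData <- as.magpie(exVec, spatial = 0)    # no spatial dimension

# with automatic detection the names are found in the spatial dimension...
getItems(exRegional, dim = 1)
# ...while with spatial = 0 they are found in the data dimension
getItems(exData, dim = 3)
```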

## Accessing magpie objects

For the following, an example data set containing population data is used.

```{r, echo = TRUE}
pm <- maxample("pop")
```

### General properties

Let's first have a look at the content of the magpie object:

Show the structure of the object:
```{r, echo = TRUE}
str(pm)
```

Show the first elements:
```{r, echo = TRUE}
head(pm)
```
Show the last elements:
```{r, echo = TRUE}
tail(pm)
```

Which elements are there?
```{r, echo = TRUE}
getItems(pm)
```

Which spatial elements are there?
```{r, echo = TRUE}
getItems(pm, dim = 1)
```

Which scenarios?
```{r, echo = TRUE}
getItems(pm, dim = 3)
```

`getItems`, as well as most of the other functions, allows you to select a dimension either via its dimension code (as done in the previous example) or via its dimension name:

```{r, echo = TRUE}
getItems(pm, dim = "scenario")
```

In terms of readability it is recommended to use the dimension name where possible, but one has to keep in mind that sometimes the dimension name might not be well defined and vary from case to case. In these instances it is safer to use the dimension code.

What are the sets (dimension names) of the data?
```{r, echo = TRUE}
getSets(pm)
```

Are there any comments which come with the data?
```{r, echo = TRUE}
getComment(pm)
```

Let's have a look at a higher-dimensional object:

```{r, echo = TRUE}
a <- maxample("animal")
```

What is the full dimensionality of this object?
```{r, echo = TRUE}
getItems(a)
```

...split in sub-dimensions
```{r, echo = TRUE}
getItems(a, split = TRUE)
```

These functions can also be used to manipulate the object:

set a comment
```{r, echo = TRUE}
getComment(pm) <- "This is a comment!"
getComment(pm)
```
...or alternatively
```{r, echo = TRUE}
pm2 <- setComment(pm, "This is comment for pm2!")
getComment(pm2)
```
rename the first region to "RRR"
```{r, echo = TRUE}
getItems(pm, dim = 1)[1] <- "RRR"
```
rename the temporal set (the second set) to "year"
```{r, echo = TRUE}
getSets(pm)[2] <- "year"
```
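
Once a set has been renamed, the new name can be used wherever a dimension name is accepted. The following sketch works on a separate copy (`exPop`, an illustrative name) so that `pm` stays as it is; the new set name `region` is likewise chosen only for this example.

```{r, echo = TRUE}
library(magclass)

# work on a separate copy so that pm is left untouched
exPop <- maxample("pop")

# rename the first (spatial) set and address the dimension by its new name
getSets(exPop)[1] <- "region"
getSets(exPop)
getItems(exPop, dim = "region")
```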

### Subsets

Now that we have had a look into the structure of the object let's extract some subsets out of it. There are different methods that can be used to extract data from a magpie object. Here are some examples:

Return all A2 related data for LAM and the years 2005 and 2015
```{r, echo = TRUE}
pm["LAM", c(2005, 2015), "A2"]
```

Return data for all regions that have "AS" in their name (`pmatch = TRUE` enables partial matching of the given search string)
```{r, echo = TRUE}
pm["AS", , , pmatch = TRUE]
```

If you know the name of a dimension, you can select elements from it explicitly by that name:
```{r, echo = TRUE}
mselect(pm, scenario = "B1", i = c("FSU", "LAM"))
```
Alternatively, you can use:
```{r, echo = TRUE}
pm[list(i = c("FSU", "LAM")), , list(scenario = "B1")]
```
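
Both notations are meant to select the same data. A quick way to convince yourself is to compare the extracted values directly, as in this sketch on a fresh copy of the example data (`exPop`, `exA` and `exB` are illustrative names):

```{r, echo = TRUE}
library(magclass)

exPop <- maxample("pop")

# same selection, once via mselect and once via list-based indexing
exA <- mselect(exPop, scenario = "B1", i = c("FSU", "LAM"))
exB <- exPop[list(i = c("FSU", "LAM")), , list(scenario = "B1")]

dim(exA)
all.equal(as.vector(exA), as.vector(exB))
```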


### Data transformations / calculations

Now we can perform some calculations with it.

take a subset of the data as an example
```{r, echo = TRUE}
d <- head(pm)
```

create a new object with some fancy calculations
```{r, echo = TRUE}
d2 <- d^2 + 12 * d + 99 / exp(d)
getItems(d2, dim = 3) <- c("NEWSCEN1", "NEWSCEN2")
getSets(d2)[3] <- "newscen"
d2
```

multiply both data sets with each other
```{r, echo = TRUE}
d <- d * d2
d
```
Because we renamed the elements and the set of the data dimension, it is assumed that they reflect different dimensions. Therefore the object is expanded in size instead of having its elements multiplied pairwise, as happens when two objects with identical dimensionality are multiplied:

```{r, echo = TRUE}
d2 * d2
```
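
The size of the data dimension makes this effect easy to check. The sketch below repeats the renaming step on illustrative copies (`exBase`, `exOther`) and counts the data elements with `ndata`:

```{r, echo = TRUE}
library(magclass)

# illustrative copies, independent of d and d2
exBase <- head(maxample("pop"))
exOther <- exBase
getItems(exOther, dim = 3) <- c("NEWSCEN1", "NEWSCEN2")
getSets(exOther)[3] <- "newscen"

# identical dimensionality: elements are multiplied pairwise
ndata(exBase * exBase)

# different data dimensions: every combination of elements is created
ndata(exBase * exOther)
getItems(exBase * exOther, dim = 3)
```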

sum over the data dimension
```{r, echo = TRUE}
dimSums(d, dim = 3)
```
...over the second data dimension only
```{r, echo = TRUE}
dimSums(d, dim = 3.2)
```
...or alternatively addressed by name
```{r, echo = TRUE}
dimSums(d, dim = "newscen")
```
sum over regions and first data dimension
```{r, echo = TRUE}
dimSums(d, dim = c(1, 3.1))
```
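
Summing over a dimension collapses it to a single element but leaves the grand total unchanged. This sketch checks both properties on a fresh copy of the population data (`exPop`, `exTotal` and `exGlobal` are illustrative names):

```{r, echo = TRUE}
library(magclass)

exPop <- maxample("pop")

# summing over the data dimension leaves a single data element
exTotal <- dimSums(exPop, dim = 3)
dim(exTotal)

# the grand total is the same before and after
all.equal(sum(exTotal), sum(exPop))

# the spatial dimension can be addressed by its set name as well
exGlobal <- dimSums(exPop, dim = "i")
dim(exGlobal)
```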

apply a lowpass filter on the data
```{r, echo = TRUE}
lowpass(d)
```

do a linear interpolation of the data over time (yearly)
```{r, echo = TRUE}
time_interpolate(d[, , 1], 2005:2030)
```
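
For linear interpolation, the value halfway between two time steps should equal the mean of the two neighbouring values. The following sketch checks this for a single region and scenario; `exPair` and `exMid` are illustrative names, and the default settings of `time_interpolate` are assumed.

```{r, echo = TRUE}
library(magclass)

# two neighbouring time steps for one region and one scenario
exPair <- maxample("pop")["LAM", c(2005, 2015), "A2"]

# interpolate the year in the middle
exMid <- time_interpolate(exPair, 2010)

as.vector(exMid)
mean(as.vector(exPair))
```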

split the data, do some calculations and bind it back together
```{r, echo = TRUE}
d1 <- d[, 1:3, ] * 100
d2 <- d[, 4:6, ] * (-1)
dd <- mbind(d1, d2)
dd
```

set all values greater than 0.5 to 0.51
```{r, echo = TRUE}
d[d > 0.5] <- 0.51
d
```

round the data
```{r, echo = TRUE}
round(d, 0)
```