Generalized Additive Models (GAMs) are an extension of Generalized Linear Models (GLMs) that allow for flexibility in modeling nonlinear relationships between predictors and the outcome variable. GAMs are particularly useful when the relationship between the predictor variables and the response variable is not well represented by a straight line.

What is the Interaction Term?
Interaction terms in GAMs allow us to explore how the effect of one predictor variable on the response variable changes at different levels of another predictor variable. In this article, we’ll demonstrate how to include an interaction term in a GAM using the Boston housing dataset. We will use the mgcv package in R to fit the model.

Basic Components of a GAM
Here we will discuss the basic components of a GAM.
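As a point of reference for these components, a GAM with two predictors and an interaction between them is commonly written in the form below. The symbols $g$, $\beta_0$, $f_1$, $f_2$ and $f_{12}$ are standard textbook notation and are assumptions here, since the article does not define its own.

```latex
g\big(\mathbb{E}[y]\big) = \beta_0 + f_1(x_1) + f_2(x_2) + f_{12}(x_1, x_2)
```

Here $g$ is the link function, $\beta_0$ is the intercept, $f_1$ and $f_2$ are smooth main effects, and $f_{12}$ is a smooth interaction surface. In mgcv formula syntax this roughly corresponds to `y ~ s(x1) + s(x2) + ti(x1, x2)`. For a Gaussian model of MEDV, $g$ is the identity.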
Generalized Additive Model on Boston Housing Dataset
The Boston housing dataset consists of various predictors such as crime rate (CRIM), average number of rooms per dwelling (RM), and proportion of owner-occupied units built before 1940 (AGE), among others, to predict the median value of owner-occupied homes (MEDV). We’ll be using a subset of these predictors to demonstrate the use of interaction terms in a GAM.

Step 1: Install and load required libraries
Install the required libraries in RStudio if they are not already installed, then load them.
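A minimal sketch of this step, assuming mgcv is the only package the walkthrough needs (the article names no others). mgcv normally ships with R as a recommended package, so the install branch is usually skipped.

```r
# Install mgcv only if it is not already available, then load it.
if (!requireNamespace("mgcv", quietly = TRUE)) {
  install.packages("mgcv")
}
library(mgcv)
```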
Step 2: Load and Prepare the Dataset
For this example, we are using the Boston housing dataset mentioned earlier in this article. Load the dataset into your R session as a data frame.
Dataset Link: Boston housing dataset
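One way to load the data is sketched below. The file name `HousingData.csv` is hypothetical, since the article's download link did not survive. If no such file is present, the sketch falls back to the copy of the data in the MASS package and upper-cases its column names to match the article. That copy has no missing values and names the B column BLACK, so its first rows will differ slightly from the printed output.

```r
# Hypothetical path to a downloaded CSV copy of the dataset (an assumption).
csv_path <- "HousingData.csv"

if (file.exists(csv_path)) {
  boston <- read.csv(csv_path)
} else {
  # Fallback: the Boston data frame bundled with the MASS package.
  boston <- MASS::Boston
  names(boston) <- toupper(names(boston))  # crim -> CRIM, rm -> RM, ...
}

# Show the first six rows.
head(boston)
```

Rows with missing values in the modeled variables are dropped by `gam()` under R's default `na.action`, so no explicit imputation is needed for this walkthrough.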
Output:
     CRIM ZN INDUS CHAS   NOX    RM  AGE    DIS RAD TAX PTRATIO      B LSTAT MEDV
1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3 396.90  4.98 24.0
2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8 396.90  9.14 21.6
3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8 392.83  4.03 34.7
4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7 394.63  2.94 33.4
5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7 396.90    NA 36.2
6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7 394.12  5.21 28.7

Step 3: Fit a GAM with an Interaction Term
We’ll fit a GAM to predict MEDV using RM (average number of rooms per dwelling) and AGE (proportion of owner-occupied units built before 1940) with an interaction term. The interaction term will allow us to see how the relationship between RM and MEDV changes with different levels of AGE.
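The model described in this step could be fitted roughly as follows. This is a sketch rather than the article's original code: it assumes the MASS copy of the data as a stand-in, and the object name `gam_model` is illustrative. `s()` fits the smooth main effects and `ti()` adds a tensor product interaction that excludes them. The summary printed in the source uses the variables mpg, hp and wt, so the numbers from this sketch will not match it.

```r
library(mgcv)

# Stand-in data: MASS::Boston with column names upper-cased (an assumption).
boston <- MASS::Boston
names(boston) <- toupper(names(boston))

# Smooth main effects for RM and AGE plus a pure interaction term.
gam_model <- gam(MEDV ~ s(RM) + s(AGE) + ti(RM, AGE), data = boston)

# Inspect estimated degrees of freedom and approximate significance.
summary(gam_model)
```

Using `ti()` rather than `te()` keeps the interaction separate from the main effects, so each term in the summary can be read on its own.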
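Output:
Note: the summary below reports the formula mpg ~ s(hp) + s(wt) + ti(hp, wt) with n = 32, which is consistent with R's built-in mtcars data and not with the MEDV, RM and AGE model described in this step. Its layout is the same as for the Boston model, but the numbers do not describe the Boston data.
Family: gaussian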
Link function: identity

Formula:
mpg ~ s(hp) + s(wt) + ti(hp, wt)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  18.8436     0.4925   38.26   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
            edf Ref.df     F p-value
s(hp)     1.000   1.00 0.546 0.46840
s(wt)     1.000   1.00 6.853 0.01648 *
ti(hp,wt) 8.542  10.24 3.793 0.00706 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) = 0.917 Deviance explained = 94.5%
GCV = 4.6958 Scale est. = 3.0021 n = 32

Step 4: Visualize the Interaction Term
Visualizing the interaction term can help us understand how RM and AGE interact to influence MEDV. We use the plot function in R for this purpose.
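A sketch of the plotting step, again assuming the MASS copy of the data and the illustrative name `gam_model`. `plot()` on a fitted `gam` object draws one panel per smooth term, and `pages = 1` places them on a single page. The `vis.gam()` call is an optional extra that is not mentioned in the article. It shows the full predicted surface over RM and AGE, not the `ti()` term alone.

```r
library(mgcv)

# Stand-in data and model, refitted here so the block runs on its own.
boston <- MASS::Boston
names(boston) <- toupper(names(boston))
gam_model <- gam(MEDV ~ s(RM) + s(AGE) + ti(RM, AGE), data = boston)

# One panel per term: s(RM), s(AGE) and the ti(RM, AGE) surface.
plot(gam_model, pages = 1)

# Optional: contour view of predicted MEDV across RM and AGE.
vis.gam(gam_model, view = c("RM", "AGE"), plot.type = "contour")
```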
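Output:
Figure: Include an Interaction Term in GAM in R

The plot will show how the effect of RM on MEDV changes with different levels of AGE. The image shows the output plots from a Generalized Additive Model (GAM) fitted with an interaction term between RM (average number of rooms per dwelling) and AGE (proportion of owner-occupied units built before 1940) from the Boston housing dataset. In each panel label, the number after the variable name is the estimated degrees of freedom (edf) of that smooth: a value near 1 indicates a nearly linear effect, and larger values indicate a more flexible curve. Here is what each plot shows:
Top Left Plot (s(RM, 8.11)): the smooth effect of RM on MEDV, with an edf of about 8.11, which indicates a clearly non-linear relationship.
Top Right Plot (s(AGE, 1.78)): the smooth effect of AGE on MEDV, with an edf of about 1.78, which indicates a nearly linear relationship.
Bottom Plot (ti(RM, AGE)): the tensor product interaction surface between RM and AGE.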
Overall, the model suggests that while RM has a notable non-linear effect on MEDV, the effect of AGE is relatively small, and there is a significant interaction between RM and AGE.

Conclusion
Relationships between predictors and the response variable can be understood more deeply when interaction terms are included in a GAM. Using the Boston housing dataset, this article walked through the steps of fitting a GAM with an interaction term. This allows us to investigate how the relationship between the number of rooms per dwelling (RM) and the median value of homes (MEDV) changes at varying proportions of owner-occupied units built before 1940 (AGE). To uncover intricate relationships in your own data, you can apply this method to other datasets and predictors. Feel free to explore the world of datasets and visualizations with R and its packages.
Referred: https://www.geeksforgeeks.org