housing <- read.csv("Datasets/housing_data_section_2_exercises.csv")
# Quick looks
head(housing)
str(housing)
summary(housing)
n <- length(housing$Price)
# I had a couple of q's in the previous week around what's happening
# with the length variables - sorry if that wasn't clear
# I'm literally just using it to get a sample size for the
# different formulas we need to run - length() returns
# the number of observations in the dataset for that variable.
Week 10 Exercise
Interactions (Maine Housing Data)
Dataset: Datasets/housing_data_section_2_exercises.csv
Theme of the week: We extend multiple regression to allow the effect of one predictor (X) to depend on the level of another predictor (Z). In additive models, the X→Y slope is the same at all levels of Z; in interaction models, we allow the X→Y slope to change with Z.
Hypotheses we will preview and practice testing today:
- Overall model significance (is Model A - an interaction model - better than additive/multiple regression Model C?)
- Significance of the interaction term (are the lines non-parallel enough for an interaction to outperform an additive model?)
- Simple slopes (what is the effect/slope of X at a specific Z?)
- “Simple simple effect” (what does the intercept mean under centering?)
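(A quick preview of the algebra behind all of this, using generic coefficients: the additive model is Yhat = b0 + b1*X + b2*Z, while the interaction model is Yhat = b0 + b1*X + b2*Z + b3*(X*Z). Regrouping the interaction model as Yhat = b0 + b2*Z + (b1 + b3*Z)*X makes the key idea explicit: the slope on X is b1 + b3*Z, which changes with Z, whereas in the additive model it is just b1, the same at every level of Z.)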
Part 0: Load & Refamiliarize with Data
Part 1: Additive vs. Interaction (House “Size-ish”)
Research question: Does the relationship between # of Bedrooms and Price depend on # of Bathrooms? (Equivalently: are regression lines for Price by Bedrooms parallel across different numbers of Bathrooms?)
# Additive Model (we don't allow an interaction; main effects only)
m_add_bedbath <- lm(Price ~ Bedrooms + Bathrooms, data = housing)
summary(m_add_bedbath)
# Interaction Model (allow slopes to vary with levels of the other predictor)
# Note that interaction model specifications are one of the few instances
# in which we can rely on R to be pretty smart: the * operator expands to
# the main effects plus their interaction (equivalent to Bedrooms +
# Bathrooms + Bedrooms:Bathrooms), so we don't have to list the lower-order
# terms ourselves. This will be obvious in the summary output below.
m_int_bedbath <- lm(Price ~ Bedrooms * Bathrooms, data = housing)
summary(m_int_bedbath)
PRE for Adding the Interaction Term
REMINDER: our model comparison is now Model A (interaction) vs. Model C (additive/multiple regression/no interaction). I mean, it could be something else, as we'll discuss in a bit, but this is probably the most common place to start when examining interactions.
So, the logic here is still the same: we're going to compare the model we're interested in, one in which we allow the x-y relationship to vary with/depend on the value of z (and, vice versa, the z-y relationship to vary with x), to one in which we do not allow those simple relationships/simple slopes to vary.
We will compare that more complicated (or augmented) interaction Model A to the simpler (more compact) additive Model C by comparing the SSE of each model using the same PRE formula we've been using all semester. This gives us the proportion of additional error explained by the more complicated model, which we can then use to evaluate whether the additional explanatory value of Model A is worth the increased complication, or whether we're better off sticking with the simpler Model C (if that sounds unfamiliar, spend a little time with the discussion of the PRE formula from week 2).
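In symbols, with Model C = additive and Model A = interaction: PRE = (SSE_C - SSE_A) / SSE_C, and the test statistic is F = (PRE / (PA - PC)) / ((1 - PRE) / (n - PA)), where PA and PC are the numbers of parameters in each model. This is just a restatement of the week 2 formula; the SSE-based F in the code further below is algebraically identical.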
Here’s what that looks like:
SSE_add <- sum(residuals(m_add_bedbath)^2)
SSE_int <- sum(residuals(m_int_bedbath)^2)
# PRE for interaction relative to additive:
PRE_int_over_add <- (SSE_add - SSE_int) / SSE_add
PRE_int_over_add
So that's our PRE: we reduce the error present in the simpler model by an additional .06% by adding the interaction term… that's… not a lot. Still, let's see this through and calculate the F and p for this model comparison.
# Partial-F for interaction (1 df numerator because we add one parameter: the product term)
df_small <- df.residual(m_add_bedbath)
df_big <- df.residual(m_int_bedbath)
df_num <- df_small - df_big # should be 1 for this exercise
df_den <- df_big
F_interaction <- ((SSE_add - SSE_int) / df_num) / (SSE_int / df_den)
p_interaction <- pf(F_interaction, df1 = df_num, df2 = df_den, lower.tail = FALSE)
F_interaction
p_interaction
So, our F for this model comparison, examining the PRE of including versus not including the interaction term, is equal to 0.13, and the p-value is equal to 0.72. Not significant, as expected. We fail to reject the null and stick with our compact, additive model.
But real quick… for this specific model comparison, examining whether or not the addition of the interaction term resulted in a meaningful PRE, was it necessary to calculate this on our own, or was it present in the initial model summary, and if so, where? Let's look:
summary(m_int_bedbath)
F_interaction
p_interaction
Can you spot the test of this model comparison in the output? The p-value is pretty easy to spot because it's identical, maybe just a rounding difference. And, appropriately, it's in the Bedrooms:Bathrooms row of the summary table (that's the interaction). The F-statistic is a bit trickier because it's not there in F form; it's there as a t, though. Remember how, with 1 numerator df, F is just t squared? Let's square that t and see what we get:
(-0.363)^2
Well, that's pretty close but not exact. WTF is that about? Rounding. It's about the rounding that takes place when summary() puts the model estimates in the nice table for us. Which is nice, and we appreciate it, but it can mean that when we try to directly reproduce things they are ever so slightly off (sidenote: keep this in mind for submitting homework… if your answer is REALLY, REALLY close to one of the options, it's probably a rounding difference… this is why I never give you SUPER-close alternatives in multiple choice and why I grade any open-response items "by hand," so you don't lose points on minor rounding issues).
So, can we recreate this by telling summary to knock that rounding off? Sure can.
format(summary(m_int_bedbath)$coefficients, digits = 10)
(-.3625114352)^2
Perfect. So, if we want to do the Model A (interaction) versus Model C (additive) comparison, we can just interpret the summary output for the interaction model. It's all there.
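One more convenience worth knowing (an equivalent route, not required here): R's built-in anova() runs this same nested-model F test directly when you hand it the compact and augmented models, so you can double-check our hand calculation:
# Same partial-F comparison, computed by R's nested-model test
anova(m_add_bedbath, m_int_bedbath)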
Interpretation guide:
- The p-value for the interaction tests the non-parallel-lines hypothesis: in other words, does the slope change enough across values of the other predictor for it to be worth including the interaction term, or are we better off sticking with the simpler model that assumes parallel lines?
- If significant, the Bedrooms→Price slope depends on Bathrooms, and vice versa.
- If non-significant, the slopes don’t vary enough for it to be worth needlessly complicating things.
Optional Visualization
We’re going to define bath_lo and bath_hi here so we can produce this visualization. We’ll return to these variables and discuss them more thoroughly in Part 3.
# Define low and high bathroom values (mean ± 1 SD) for visualization
bath_mean <- mean(housing$Bathrooms, na.rm = TRUE)
bath_sd <- sd(housing$Bathrooms, na.rm = TRUE)
bath_lo <- bath_mean - bath_sd
bath_hi <- bath_mean + bath_sd
# Very quick visual: two Bathrooms slices (low/high) for Bedrooms → Price
x <- seq(min(housing$Bedrooms), max(housing$Bedrooms), length.out = 80)
lo <- data.frame(Bedrooms = x, Bathrooms = bath_lo)
hi <- data.frame(Bedrooms = x, Bathrooms = bath_hi)
plot(housing$Bedrooms, housing$Price, pch = 16, cex = .6,
xlab = "Bedrooms", ylab = "Price", main = "Additive vs Interaction (slices)")
lines(x, predict(m_add_bedbath, newdata = lo), lty = 2)
lines(x, predict(m_add_bedbath, newdata = hi), lty = 2)
lines(x, predict(m_int_bedbath, newdata = lo))
lines(x, predict(m_int_bedbath, newdata = hi))
legend("topleft", bty = "n", c("Add low", "Add high", "Int low", "Int high"),
       lty = c(2, 2, 1, 1))
Finding a More Interesting Interaction
All that said, the next thing we're going to do is start further interpreting, or breaking down, an interaction. You can run the models below on a non-significant interaction… but frankly there isn't much point in doing so unless you have a REALLY good reason to. So let's quickly examine one more interaction model before we move on, one that will provide an interaction that looks at least slightly more interesting…
summary(lm(Price ~ Num_Dependents * Buyer_Income, data = housing))
So, we've got a model with an interaction included, and the interaction isn't quite significant, but it's close. Now, here's where we need to be a bit more sophisticated than someone who's not particularly well-trained, at least in how we write up our results. Some people learn analyses not only without any underlying intuition for what research questions they're asking or what calculations they're specifying, but also via a decision-tree-style approach to probing interactions. If x, then y. If p < .05, then probe simple effects. If not, you simply can't do anything, nothing to see here, move on.
But remember, we get to decide what kinds of effects matter and what false positive and false negative rates we're willing to accept. And if I think there's reason to believe this particular study is a bit underpowered (because it's noisy, or the sample isn't as big as I might like), or I'm just willing to reject a null at p < .1 instead of p < .05, I can do that. I'm not saying you should. In fact, I'd caution against doing so, particularly if you think there's a chance you're being motivated to find an effect you were looking for/are already convinced of… but I am saying that you get to make the judgment calls here, and R is just a silly tool that runs the numbers for you. So, if I'm comfortable with a different alpha/p-value being sufficiently of interest to probe, that's my call.
What next? Well, just as with a simple regression we wouldn't stop at saying "x significantly predicts y…"; we'd need to make clear the nature (positive or negative) and the extent or magnitude of that relationship. In this case, since we've decided this marginal interaction is worth probing ("marginal" is a term people sometimes use for effects that approach the conventional p < .05 threshold but don't quite make it), we need to make sure we understand how the x-y relationship changes with z.
Part 2: Probing a Marginal Interaction
Research question: Does the effect of Buyer_Income on Price depend on Num_Dependents?
2A) Fit Additive and Interaction Models
m_add_dep_inc <- lm(Price ~ Num_Dependents + Buyer_Income, data = housing)
m_int_dep_inc <- lm(Price ~ Num_Dependents * Buyer_Income, data = housing)
summary(m_add_dep_inc)
summary(m_int_dep_inc)
# Look at the Num_Dependents:Buyer_Income line (marginal/near p < .10)
2B) PRE for Adding the Interaction Term
Model A = interaction; Model C = additive
SSE_add_DI <- sum(residuals(m_add_dep_inc)^2)
SSE_int_DI <- sum(residuals(m_int_dep_inc)^2)
PRE_int_over_add_DI <- (SSE_add_DI - SSE_int_DI) / SSE_add_DI
PRE_int_over_add_DI
# Partial-F test (numerator df = 1 because we add a single product term)
df_small_DI <- df.residual(m_add_dep_inc)
df_big_DI <- df.residual(m_int_dep_inc)
df_num_DI <- df_small_DI - df_big_DI
df_den_DI <- df_big_DI
F_int_DI <- ((SSE_add_DI - SSE_int_DI) / df_num_DI) / (SSE_int_DI / df_den_DI)
p_int_DI <- pf(F_int_DI, df1 = df_num_DI, df2 = df_den_DI, lower.tail = FALSE)
F_int_DI
p_int_DI
OK, now we're going to go ahead and probe simple slopes to learn the pattern.
2C) Simple Slopes by Recoding Z
Strategy: Define Num_Dependents_centered = Num_Dependents - z0. Then fit: Price ~ Buyer_Income * Num_Dependents_centered. In that model, the coefficient on Buyer_Income IS the simple slope at z0.
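(Why the recoding trick works: with ND_c = Num_Dependents - z0, the fitted model is Price-hat = b0 + b1*Buyer_Income + b2*ND_c + b3*(Buyer_Income*ND_c). Collecting the Buyer_Income terms gives a slope of b1 + b3*ND_c, and when Num_Dependents = z0 we have ND_c = 0, so the slope reduces to b1, exactly the coefficient that summary() reports for Buyer_Income.)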
One thing we can do is to pick common descriptive values — mean and +/- 1 SD:
# Pick three z0 values: low, mean, high
z_mean <- mean(housing$Num_Dependents, na.rm = TRUE)
z_sd <- sd(housing$Num_Dependents, na.rm = TRUE)
z_lo <- z_mean - z_sd
z_hi <- z_mean + z_sd
z_mean; z_lo; z_hi
Simple Slope at Num_Dependents = Mean
housing$ND_c_mean <- housing$Num_Dependents - z_mean
m_ss_mean <- lm(Price ~ Buyer_Income * ND_c_mean, data = housing)
summary(m_ss_mean)
confint(m_ss_mean) # default 95% CI for the simple slope at z0... but...
# we did decide we were comfortable with alpha = .10, so we can also run:
confint(m_ss_mean, level = .9)
# a 90% CI, which is a bit more consistent with that decision
# Pull out the simple slope line for your write-up:
# The coefficient on Buyer_Income in m_ss_mean is the simple slope at ND = mean.
coef(summary(m_ss_mean))["Buyer_Income", ]
So, we could say that for households with an average number of dependents (1.93), as buyer annual household income increases by $1, we predict an additional ~$0.024 increase in the purchase price of a home.
What about at those low and high values in our sample?
Simple Slope at Num_Dependents = Low (Mean - 1 SD)
housing$ND_c_lo <- housing$Num_Dependents - z_lo
m_ss_lo <- lm(Price ~ Buyer_Income * ND_c_lo, data = housing)
summary(m_ss_lo)
confint(m_ss_lo, level = .9)
coef(summary(m_ss_lo))["Buyer_Income", ]
Interesting… so, for households with a low number of dependents for our particular sample, we see a different relationship between buyers' annual household income and purchase price. Our model's estimate of the simple slope for Buyer_Income predicting Price at a low number of dependents (0.5) is that as annual household income increases by $1, we actually predict a purchase price that is ~$0.23 lower.
OK, now how about for people on the high end of number of dependents for this particular sample? What’s the relationship between household income and house price for them?
Simple Slope at Num_Dependents = High (Mean + 1 SD)
housing$ND_c_hi <- housing$Num_Dependents - z_hi
m_ss_hi <- lm(Price ~ Buyer_Income * ND_c_hi, data = housing)
summary(m_ss_hi)
confint(m_ss_hi, level = 0.9)
coef(summary(m_ss_hi))["Buyer_Income", ]
For those with a relatively high number of dependents in our sample (3.36), we see a different simple-slope prediction from our model. Specifically, as annual household income increases by $1, we predict that house purchase prices go up by ~$0.27, the opposite of the pattern we saw in the low-dependents version of the model.
Simple Slopes at Specific Values
So, those are reasonable and fairly common values to test (mean and +/- 1 SD), but the thing about each is that they sometimes don't make sense for variables that only take whole-number values (like number of dependents). So, we can also specify values of interest that actually exist and examine the simple slopes at those possible values. In this particular case, the range in our sample is 0 to 4, so let's just check 0, 1, 2, 3, and 4 dependents.
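Before the copy-paste version below, here's a compact alternative: a loop that refits the model with Num_Dependents centered at each whole-number value and pulls out the Buyer_Income simple slope (a sketch; it produces the same estimates as the blocks that follow):
# Loop over target values of Num_Dependents, center, refit, and collect
# the simple slope of Buyer_Income at each value
for (z0 in 0:4) {
  housing$ND_c <- housing$Num_Dependents - z0
  m_z0 <- lm(Price ~ Buyer_Income * ND_c, data = housing)
  cat("Num_Dependents =", z0, "\n")
  print(coef(summary(m_z0))["Buyer_Income", ])
}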
## ND = 0
housing$ND_c_0 <- housing$Num_Dependents - 0
m_ss_ND0 <- lm(Price ~ Buyer_Income * ND_c_0, data = housing)
summary(m_ss_ND0)
confint(m_ss_ND0, level = 0.90)
coef(summary(m_ss_ND0))["Buyer_Income", ]
## ND = 1
housing$ND_c_1 <- housing$Num_Dependents - 1
m_ss_ND1 <- lm(Price ~ Buyer_Income * ND_c_1, data = housing)
summary(m_ss_ND1)
confint(m_ss_ND1, level = 0.90)
coef(summary(m_ss_ND1))["Buyer_Income", ]
## ND = 2
housing$ND_c_2 <- housing$Num_Dependents - 2
m_ss_ND2 <- lm(Price ~ Buyer_Income * ND_c_2, data = housing)
summary(m_ss_ND2)
confint(m_ss_ND2, level = 0.90)
coef(summary(m_ss_ND2))["Buyer_Income", ]
## ND = 3
housing$ND_c_3 <- housing$Num_Dependents - 3
m_ss_ND3 <- lm(Price ~ Buyer_Income * ND_c_3, data = housing)
summary(m_ss_ND3)
confint(m_ss_ND3, level = 0.90)
coef(summary(m_ss_ND3))["Buyer_Income", ]
## ND = 4
housing$ND_c_4 <- housing$Num_Dependents - 4
m_ss_ND4 <- lm(Price ~ Buyer_Income * ND_c_4, data = housing)
summary(m_ss_ND4)
confint(m_ss_ND4, level = 0.90)
coef(summary(m_ss_ND4))["Buyer_Income", ]
Simple Slopes the Other Way
Now you might be wondering, "But what about the simple slopes for Price ~ Num_Dependents at different levels of buyer income? Why didn't we examine things that way?" And to that I say… I don't know, I just made a call. But that's the thing about interpreting and probing an interaction: you need to make some judgment calls, and often there's no clear right or wrong answer, just you deciding how to approach understanding and communicating what you learn from the data.
That said, typically you wouldn't report simple effects both ways; you'd pick one and stick with it. And sometimes there is an obvious direction in which to take those simple effects: examining the effect of age on race performance at different levels of training makes more intuitive sense (at least to me and the authors of your textbook) than examining the effect of different levels of training on race performance at different ages, but you could certainly choose to analyze the data and tell either story. So, even though you wouldn't really do this in practice, let's examine our simple effects the other way right now, just for completeness and practice.
income_med_single <- 34394 # this is the median income for a single person household in ME
income_med_household <- 71773 # this is the median household income in ME
housing$Income_med_single <- housing$Buyer_Income - income_med_single
m_inc_med_single <- lm(Price ~ Num_Dependents * Income_med_single, data = housing)
summary(m_inc_med_single)
confint(m_inc_med_single, level = 0.90)
coef(summary(m_inc_med_single))["Num_Dependents", ]
housing$Income_med_household <- housing$Buyer_Income - income_med_household
m_inc_med_household <- lm(Price ~ Num_Dependents * Income_med_household, data = housing)
summary(m_inc_med_household)
confint(m_inc_med_household, level = 0.90)
coef(summary(m_inc_med_household))["Num_Dependents", ]
So, at each of these two levels of income, the predicted purchase price of a house decreases significantly for each additional dependent in the household: by about $15,370 per additional dependent for households making the median income for a single-earner household, and by about $8,827 per additional dependent for households making about the overall median household income (not limited to single-earner households).
It would be cool if we could get not only estimates of how simple slopes change at specified values of one part of an interaction but also estimates of the DV at those specified values: namely, the home price.
Do I have good news for you. We can. Specifically, we can interpret the intercept as a so-called "simple simple effect" at the levels at which our predictor variables are coded as 0. On to that next.
2D) “Simple Simple Effect” (Intercept Meaning Under Centering)
In the “mean-centered” model we specified above, the intercept is predicted Price for: Buyer_Income = 0 AND Num_Dependents = mean(Num_Dependents).
summary(m_ss_mean)
So, with our model, for a family with 1.93 dependents and no annual income, we would predict a house purchase price of $483,200… but, like… what's the point of such a nonsense prediction?
That “Income = 0” is NOT literally meaningful, so let's center Income too, making the intercept interpretable as the predicted Price at average Income and average Dependents.
inc_mean <- mean(housing$Buyer_Income, na.rm = TRUE)
housing$Inc_c <- housing$Buyer_Income - inc_mean
m_int_centered <- lm(Price ~ Inc_c * ND_c_mean, data = housing)
summary(m_int_centered)
confint(m_int_centered)
With the model specified in this way, the output can be interpreted as:
- Intercept = predicted Price at average Income and average Dependents.
- Coef on Inc_c = simple slope of Income at average Dependents.
- Coef on ND_c_mean = simple slope of Dependents at average Income.
- Product term = same interaction test as before (coding does not change significance).
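As a quick cross-check of that intercept interpretation (nothing new estimated here): predicting from m_int_centered with both centered predictors set to 0 should reproduce the intercept, up to rounding:
# Predicted Price at average Income and average Dependents
# (both centered predictors = 0); should match the intercept above
predict(m_int_centered, newdata = data.frame(Inc_c = 0, ND_c_mean = 0))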
2E) Optional Visualization
Two fitted lines at low vs. high Dependents:
# Create a little grid of Buyer_Income values
x_seq <- seq(min(housing$Buyer_Income), max(housing$Buyer_Income), length.out = 60)
# Build two data frames: one at ND = z_lo, one at ND = z_hi
nd_lo_df <- data.frame(Buyer_Income = x_seq,
Num_Dependents = rep(z_lo, length(x_seq)))
nd_hi_df <- data.frame(Buyer_Income = x_seq,
Num_Dependents = rep(z_hi, length(x_seq)))
# Predict using the interaction model
yhat_lo <- predict(m_int_dep_inc, newdata = nd_lo_df)
yhat_hi <- predict(m_int_dep_inc, newdata = nd_hi_df)
# Quick base R plot
plot(housing$Buyer_Income, housing$Price,
pch = 16, cex = 0.7,
xlab = "Buyer Income", ylab = "Price",
main = "Simple Lines: Income at low vs high Num_Dependents")
lines(x_seq, yhat_lo, lwd = 2, lty = 2) # low dependents
lines(x_seq, yhat_hi, lwd = 2) # high dependents
legend("topleft", bty = "n",
legend = c(paste0("Low Dependents (~", round(z_lo, 2), ")"),
paste0("High Dependents (~", round(z_hi, 2), ")")),
       lty = c(2, 1), lwd = 2)
Part 3: Quick Demo When the Interaction Isn’t There
Foreshadowing: it's going to be a lot of work for little new info, which is why we would typically stick with the additive (no-interaction) model.
Research question: Does the effect of Bedrooms depend on Bathrooms?
3A) Reminder (from Part 1)
summary(m_add_bedbath)
summary(m_int_bedbath) # interaction not significant
3B) Simple Slopes at +/- 1 SD Bathrooms
To show they’re basically the same:
# We defined these in Part 1, but here they are again for reference:
bath_mean <- mean(housing$Bathrooms, na.rm = TRUE)
bath_sd <- sd(housing$Bathrooms, na.rm = TRUE)
bath_lo <- bath_mean - bath_sd
bath_hi <- bath_mean + bath_sd
# Center Bathrooms at each target and refit
housing$Bath_c_mean <- housing$Bathrooms - bath_mean
m_BB_mean <- lm(Price ~ Bedrooms * Bath_c_mean, data = housing)
housing$Bath_c_lo <- housing$Bathrooms - bath_lo
m_BB_lo <- lm(Price ~ Bedrooms * Bath_c_lo, data = housing)
housing$Bath_c_hi <- housing$Bathrooms - bath_hi
m_BB_hi <- lm(Price ~ Bedrooms * Bath_c_hi, data = housing)
# Compare the simple slopes of Bedrooms across these three models
coef(summary(m_BB_mean))["Bedrooms", ]
coef(summary(m_BB_lo))["Bedrooms", ]
coef(summary(m_BB_hi))["Bedrooms", ]
You should find that the Bedrooms slope is very similar at low/mean/high Bathrooms. That matches the non-significant interaction: the lines are (approximately) parallel.
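If you'd like those three slopes side by side for the write-up, you can stack the rows (a small convenience sketch using the models above):
# Stack the three Bedrooms simple slopes for an at-a-glance comparison
rbind(bath_lo   = coef(summary(m_BB_lo))["Bedrooms", ],
      bath_mean = coef(summary(m_BB_mean))["Bedrooms", ],
      bath_hi   = coef(summary(m_BB_hi))["Bedrooms", ])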
Part 4: Your Turn — A Significant Interaction
Price ~ Buyer_Age * Credit_Score
Task overview (do these steps and write up results):
4A) Fit the additive vs. interaction models:
- Report PRE, F, and p for the one-df interaction comparison.
- Point to the interaction row in summary() and connect it to your manual F.
4B) Simple slopes for the effect of Buyer_Age at:
- Credit_Score = mean
- Credit_Score = mean - 1 SD
- Credit_Score = mean + 1 SD
- Credit_Score = 600
- Credit_Score = 700
- Credit_Score = 800
Implement by centering Credit_Score at each target value and fitting: Price ~ Buyer_Age * Credit_Score_centered. In each model, report the simple slope of Buyer_Age. Interpret the pattern in plain language (who shows a stronger age-price link?).
4C) “Simple simple effect” (intercept meaning):
- Center BOTH Buyer_Age and Credit_Score at their means, then fit Price ~ Age_c * Score_c.
- Interpret the intercept (Price at average Age and average Score).
- Interpret the main-effect lines as simple slopes at the means.
I'm going to go ahead and recommend that you take a stab at writing this code on your own "from scratch" (by which I mean probably copying and pasting from the sample code above and adjusting it so that it runs the specified test, not that you have to write with zero references).
I'm going to supply you below with some helper code that gets you most of the way toward answering about half of the questions above, but I'd recommend you attempt it on your own as practice for what you'll need to do on the final exam.
# Fit additive and interaction
m_add_age_sc <- lm(Price ~ Buyer_Age + Credit_Score, data = housing)
m_int_age_sc <- lm(Price ~ Buyer_Age * Credit_Score, data = housing)
summary(m_add_age_sc)
summary(m_int_age_sc)
# PRE / F / p for adding the interaction
SSE_add_AS <- sum(residuals(m_add_age_sc)^2)
SSE_int_AS <- sum(residuals(m_int_age_sc)^2)
PRE_int_over_add_AS <- (SSE_add_AS - SSE_int_AS) / SSE_add_AS
PRE_int_over_add_AS
df_small_AS <- df.residual(m_add_age_sc)
df_big_AS <- df.residual(m_int_age_sc)
F_int_AS <- ((SSE_add_AS - SSE_int_AS) / (df_small_AS - df_big_AS)) / (SSE_int_AS / df_big_AS)
p_int_AS <- pf(F_int_AS, df1 = (df_small_AS - df_big_AS), df2 = df_big_AS, lower.tail = FALSE)
F_int_AS
p_int_AS
# Setup for simple slopes (center Score at mean, mean±SD)
sc_mean <- mean(housing$Credit_Score, na.rm = TRUE)
sc_sd <- sd(housing$Credit_Score, na.rm = TRUE)
sc_lo <- sc_mean - sc_sd
sc_hi <- sc_mean + sc_sd
# At mean Score
housing$SC_c_mean <- housing$Credit_Score - sc_mean
m_SS_mean <- lm(Price ~ Buyer_Age * SC_c_mean, data = housing)
summary(m_SS_mean); confint(m_SS_mean)
coef(summary(m_SS_mean))["Buyer_Age", ]
# At low Score
housing$SC_c_lo <- housing$Credit_Score - sc_lo
m_SS_lo <- lm(Price ~ Buyer_Age * SC_c_lo, data = housing)
summary(m_SS_lo); confint(m_SS_lo)
coef(summary(m_SS_lo))["Buyer_Age", ]
# At high Score
housing$SC_c_hi <- housing$Credit_Score - sc_hi
m_SS_hi <- lm(Price ~ Buyer_Age * SC_c_hi, data = housing)
summary(m_SS_hi); confint(m_SS_hi)
coef(summary(m_SS_hi))["Buyer_Age", ]
# Center both for "simple simple effect" interpretation of the intercept
housing$Age_c <- housing$Buyer_Age - mean(housing$Buyer_Age, na.rm = TRUE)
housing$Score_c <- housing$Credit_Score - sc_mean
m_int_centered_AS <- lm(Price ~ Age_c * Score_c, data = housing)
summary(m_int_centered_AS); confint(m_int_centered_AS)