Ordinary Least Squares (OLS) is the workhorse of regression analysis, providing Best Linear Unbiased Estimators (BLUE) under the Gauss-Markov assumptions. However, when error distributions deviate from normality—particularly when they exhibit skewness—OLS estimates, while still unbiased, lose statistical efficiency. This means larger standard errors, wider confidence intervals, and reduced power for hypothesis tests.
Non-normal, skewed errors are common in many applied settings; in these contexts, relying solely on OLS can lead to suboptimal inference.
The Polynomial Maximization Method of order 2 (PMM2) is a moment-based estimation technique that leverages higher-order moments of residuals to achieve lower estimator variance, and hence more precise inference, than OLS when the error distribution is asymmetric.
PMM2 works by constructing a system of polynomial equations based on central moments:
\[ Z_j(\beta) = \sum_{i=1}^{n} x_{ij} \left[ A \cdot \hat{y}_i^2 + B \cdot \hat{y}_i + C \right] = 0 \]
where \(x_{ij}\) is the value of the j-th regressor for observation \(i\), \(\hat{y}_i\) is the model prediction, and the coefficients \(A\), \(B\), and \(C\) depend on the central moments of the residuals from an initial OLS fit (the \(m_2\), \(m_3\), \(m_4\) reported by summary() later in this vignette).
The algorithm iteratively solves this system using a gradient-based approach, starting from OLS estimates.
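The exact update rule is internal to the package, but the idea can be sketched as a Newton-type iteration on the estimating equations. The following is a minimal conceptual sketch, not the package's implementation; Z() stands in for the moment-based system above, with A, B, and C recomputed from the residual moments of an initial OLS fit.

# Conceptual sketch only: solve a system of estimating equations Z(beta) = 0
# by Newton iterations with a numerical Jacobian, starting from OLS estimates.
solve_moment_system <- function(Z, beta_ols, tol = 1e-8, max_iter = 50) {
  beta <- beta_ols
  for (iter in seq_len(max_iter)) {
    z <- Z(beta)
    # forward-difference Jacobian: column j approximates dZ / d(beta_j)
    J <- sapply(seq_along(beta), function(j) {
      h <- 1e-6 * max(1, abs(beta[j]))
      b <- beta
      b[j] <- b[j] + h
      (Z(b) - z) / h
    })
    step <- solve(J, z)
    beta <- beta - step
    if (max(abs(step)) < tol) break
  }
  beta
}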
Reference: Zabolotnii et al. (2018) - “Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors” (Springer, AUTOMATION 2018)
Let’s explore PMM2 with practical examples using the
EstemPMM package.
We’ll simulate data where errors follow a skewed distribution (using \(\chi^2\)) to demonstrate PMM2’s advantages.
# Sample size
n <- 200
# Generate predictor
x <- rnorm(n, mean = 5, sd = 2)
# True parameters
beta_0 <- 10
beta_1 <- 2.5
# Generate highly skewed errors using chi-squared distribution
# (chi-square with df=3 has skewness ~ 1.63)
errors <- rchisq(n, df = 3) - 3 # Center to have mean ~ 0
# Response variable
y <- beta_0 + beta_1 * x + errors
# Create data frame
dat <- data.frame(y = y, x = x)

# Fit OLS to get residuals for visualization
fit_ols_temp <- lm(y ~ x, data = dat)
residuals_ols <- residuals(fit_ols_temp)
# Plot histogram of residuals
par(mfrow = c(1, 2))
hist(residuals_ols, breaks = 30, main = "Residual Distribution",
xlab = "Residuals", col = "lightblue", border = "white")
qqnorm(residuals_ols, main = "Q-Q Plot")
qqline(residuals_ols, col = "red", lwd = 2)
par(mfrow = c(1, 1))
# Calculate skewness and kurtosis
cat(sprintf("Skewness: %.3f\n", pmm_skewness(residuals_ols)))
#> Skewness: 1.865
cat(sprintf("Excess Kurtosis: %.3f\n", pmm_kurtosis(residuals_ols) - 3))
#> Excess Kurtosis: 1.710

The histogram and Q-Q plot clearly show right skewness, violating the normality assumption that underpins OLS optimality.
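For S = 2, PMM2 theory ties the expected efficiency gain to exactly these two quantities: the expected variance ratio is \(g = 1 - c_3^2 / (2 + c_4)\), where \(c_3\) is the residual skewness and \(c_4\) the excess kurtosis (this is the quantity reported as g by summary() later in this vignette). A rough estimate from the OLS residuals:

# Expected Var[PMM2]/Var[OLS] implied by the sample residual moments
c3 <- pmm_skewness(residuals_ols)
c4 <- pmm_kurtosis(residuals_ols) - 3
1 - c3^2 / (2 + c4)  # values well below 1 signal a substantial expected gain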
# Fit PMM2 model
fit_pmm2 <- lm_pmm2(y ~ x, data = dat, verbose = FALSE)
# Fit OLS model for comparison
fit_ols <- lm(y ~ x, data = dat)
# Display coefficients
cat("=== OLS Estimates ===\n")
#> === OLS Estimates ===
print(coef(fit_ols))
#> (Intercept) x
#> 10.188412 2.423676
cat("\n=== PMM2 Estimates ===\n")
#>
#> === PMM2 Estimates ===
print(coef(fit_pmm2))
#> [1] 9.980738 2.465210
cat("\n=== True Values ===\n")
#>
#> === True Values ===
cat("Intercept:", beta_0, "\n")
#> Intercept: 10
cat("Slope: ", beta_1, "\n")
#> Slope: 2.5

To rigorously compare estimator efficiency, we use bootstrap inference:
# Perform bootstrap inference
boot_results <- pmm2_inference(fit_pmm2, formula = y ~ x, data = dat,
B = 500)
# Display summary with confidence intervals
summary(boot_results)
#> Estimate Std.Error t.value p.value conf.low
#> Min. :2.465 Min. :0.06404 Min. :27.25 Min. :0 Min. :2.348
#> 1st Qu.:4.344 1st Qu.:0.13959 1st Qu.:30.06 1st Qu.:0 1st Qu.:4.072
#> Median :6.223 Median :0.21514 Median :32.87 Median :0 Median :5.796
#> Mean :6.223 Mean :0.21514 Mean :32.87 Mean :0 Mean :5.796
#> 3rd Qu.:8.102 3rd Qu.:0.29069 3rd Qu.:35.68 3rd Qu.:0 3rd Qu.:7.520
#> Max. :9.981 Max. :0.36624 Max. :38.49 Max. :0 Max. :9.244
#> conf.high
#> Min. : 2.597
#> 1st Qu.: 4.615
#> Median : 6.634
#> Mean : 6.634
#> 3rd Qu.: 8.652
#> Max. :10.670

The bootstrap results provide, for each coefficient, a point estimate, a bootstrap standard error, a t-value with its p-value, and a percentile confidence interval (conf.low, conf.high).

Key Observation: PMM2 standard errors are typically 10-30% smaller than their OLS counterparts when errors are skewed, indicating higher precision; a direct comparison is sketched below.
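To see the precision gain in this example, the bootstrap standard errors can be set against the OLS ones. The snippet below assumes that boot_results is a data frame exposing the Std.Error column shown in the summary above; adjust the extraction if the object is structured differently.

# Illustrative comparison of standard errors (structure of boot_results assumed)
se_ols  <- summary(fit_ols)$coefficients[, "Std. Error"]
se_pmm2 <- boot_results$Std.Error
round(100 * (1 - se_pmm2 / se_ols), 1)  # percent reduction relative to OLS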
PMM2 extends naturally to multiple predictors.
# Generate multiple predictors
n <- 250
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- runif(n, min = -2, max = 2)
x3 <- rexp(n, rate = 0.5)
# True model
beta <- c(5, 1.2, -0.8, 2.0) # Intercept, x1, x2, x3
errors <- rt(n, df = 4) # t-distribution with heavy tails
y <- beta[1] + beta[2]*x1 + beta[3]*x2 + beta[4]*x3 + errors
# Create data frame
dat_multi <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)
# Fit models
fit_pmm2_multi <- lm_pmm2(y ~ x1 + x2 + x3, data = dat_multi, verbose = FALSE)
fit_ols_multi <- lm(y ~ x1 + x2 + x3, data = dat_multi)
# Compare coefficients
comparison <- data.frame(
Parameter = c("Intercept", "x1", "x2", "x3"),
True = beta,
OLS = coef(fit_ols_multi),
PMM2 = coef(fit_pmm2_multi)
)
print(comparison, row.names = FALSE)
#> Parameter True OLS PMM2
#> Intercept 5.0 4.9484846 4.939106
#> x1 1.2 1.2531710 1.249675
#> x2 -0.8 -0.8080797 -0.792286
#> x3 2.0 2.0101585 2.014457

With symmetric (though heavy-tailed) t errors, the skewness-driven correction in PMM2 is small, so the two sets of estimates are nearly identical; the efficiency gains appear when the error distribution is asymmetric.

PMM2 handles polynomial terms seamlessly.
# Generate data with quadratic relationship
n <- 200
x <- seq(-3, 3, length.out = n)
errors <- rchisq(n, df = 5) - 5
# True quadratic model: y = 2 + 3x - 0.5x^2
y <- 2 + 3*x - 0.5*x^2 + errors
dat_poly <- data.frame(y = y, x = x)
# Fit polynomial model
fit_pmm2_poly <- lm_pmm2(y ~ x + I(x^2), data = dat_poly, verbose = FALSE)
# Summary
summary(fit_pmm2_poly, formula = y ~ x + I(x^2), data = dat_poly)
#> PMM2 estimation results
#> Call:
#> lm_pmm2(formula = y ~ x + I(x^2), data = dat_poly, verbose = FALSE)
#>
#> Coefficients:
#> [1] 2.0897269 2.8725555 -0.4779162
#>
#> Central moments of initial residuals:
#> m2 = 10.05743
#> m3 = 37.45091
#> m4 = 489.7792
#>
#> Theoretical characteristics of PMM2 (S = 2):
#> c3 = 1.174172
#> c4 = 1.842015
#> g = 0.6411572 (expected ratio Var[PMM2]/Var[OLS])
#>
#> Algorithm information:
#> Convergence status: TRUE
#> Iterations: 3
#>
#> Approximate statistical inference via bootstrap (B=100):
#> Estimate Std.Error t.value p.value conf.low conf.high
#> 1 2.0897269 0.28555149 7.318214 2.513545e-13 1.5799690 2.6122483
#> 2 2.8725555 0.09797060 29.320586 0.000000e+00 2.6892140 3.0583990
#> 3 -0.4779162 0.06130577 -7.795616 6.439294e-15 -0.5863997 -0.3492109
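The theoretical characteristics reported in this summary can be reproduced from the printed central moments, using the same relationship noted earlier: \(c_3\) is the residual skewness, \(c_4\) the excess kurtosis, and \(g = 1 - c_3^2/(2 + c_4)\) the expected variance ratio.

# Reproduce c3, c4 and g from the central moments printed above
m2 <- 10.05743; m3 <- 37.45091; m4 <- 489.7792
c3 <- m3 / m2^1.5      # skewness of the residuals, ~1.174
c4 <- m4 / m2^2 - 3    # excess kurtosis, ~1.842
1 - c3^2 / (2 + c4)    # expected Var[PMM2]/Var[OLS], ~0.641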
PMM2 fits also support standard diagnostic methods, including residual-based diagnostic plots, as sketched below.
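A basic residual check can be run with ordinary base-R graphics. The sketch below assumes that residuals() and fitted() methods are available for lm_pmm2 objects, in line with the standard-methods support noted above.

# Residual diagnostics for the polynomial fit (method availability assumed)
res_poly <- residuals(fit_pmm2_poly)
fit_poly <- fitted(fit_pmm2_poly)
par(mfrow = c(1, 2))
plot(fit_poly, res_poly, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted")
abline(h = 0, lty = 2)
qqnorm(res_poly, main = "Q-Q Plot of PMM2 Residuals")
qqline(res_poly, col = "red", lwd = 2)
par(mfrow = c(1, 1))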
Based on simulation studies documented in the package:
| Model | Sample Size | MSE(PMM2) / MSE(OLS) | Variance Reduction |
|---|---|---|---|
| MA(1) | 200 / 1000 | 0.85 / 0.77 | 15-23% |
| MA(2) | 200 / 1000 | 0.68 / 0.74 | 26-32% |
| ARMA(1,1) | 200 / 1000 | 0.70 / 0.55 | 30-45% |
| ARIMA(1,1,1) | 300 / 1000 | 0.76 / 0.48 | 24-52% |
These ratios, all below 1.0 (the two values in each cell correspond to the smaller and larger sample size, respectively), confirm that PMM2 achieves lower mean squared error than the classical methods (CSS/OLS) when errors deviate from normality.
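The table summarizes the package's time-series simulation studies. For intuition, an analogous comparison can be run for the simple linear model from the first example; the following is an illustrative Monte Carlo sketch, not a reproduction of the documented studies.

# Illustrative Monte Carlo: MSE of the slope estimate under skewed errors
set.seed(123)
R <- 200; n_sim <- 200
err_ols <- err_pmm2 <- numeric(R)
for (r in seq_len(R)) {
  xs <- rnorm(n_sim, mean = 5, sd = 2)
  ys <- 10 + 2.5 * xs + (rchisq(n_sim, df = 3) - 3)
  d  <- data.frame(y = ys, x = xs)
  err_ols[r]  <- (coef(lm(y ~ x, data = d))[2] - 2.5)^2
  err_pmm2[r] <- (coef(lm_pmm2(y ~ x, data = d, verbose = FALSE))[2] - 2.5)^2
}
mean(err_pmm2) / mean(err_ols)  # a ratio below 1 indicates a PMM2 efficiency gain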
Use PMM2 when residual diagnostics show clear asymmetry (skewness) in the errors and the sample is large enough to estimate higher-order moments reliably; a quick check is shown below.

Stick with OLS when the errors are approximately symmetric or normal, since PMM2 then offers little or no efficiency gain (the expected ratio g is close to 1).
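A simple way to decide between the two is to inspect the skewness of the OLS residuals with the moment helpers used earlier.

# Quick screen: pronounced residual asymmetry favors PMM2
res_check <- residuals(lm(y ~ x, data = dat))
pmm_skewness(res_check)  # values near 0 suggest OLS is adequate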
When using PMM2, rely on bootstrap inference (pmm2_inference()) for robust confidence intervals.

To learn more about EstemPMM:
vignette("02-pmm2-time-series") for AR, MA, ARMA, and ARIMA
modelsvignette("03-bootstrap-inference") for detailed statistical
inference proceduresThe PMM2 method provides a powerful alternative to OLS when error distributions deviate from normality. By leveraging higher-order moments, PMM2 achieves:
The EstemPMM package makes PMM2 accessible through an
intuitive interface that mirrors standard R modeling functions like
lm().
Zabolotnii, S., Warsza, Z.L., Tkachenko, O. (2018). “Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors”. In: Automation 2018. Advances in Intelligent Systems and Computing, vol 743. Springer, Cham. https://doi.org/10.1007/978-3-319-77179-3_75
Package documentation: ?lm_pmm2,
?pmm2_inference, ?compare_with_ols
GitHub repository: https://github.com/SZabolotnii/EstemPMM