Setup
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.1
## Warning: package 'tibble' was built under R version 4.5.1
## Warning: package 'purrr' was built under R version 4.5.1
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
meta_2010_2024 <- readRDS("C:/Users/cmor7/OneDrive/Desktop/Projects/nfl_predictions/data/meta_2010_2024.rds")
Does the moon_fraction correlate with spread_outcome?
# Step 1: Filter for games where the spread was missed
spread_miss <- meta_2010_2024 %>%
select(game_id, moon_fraction, spread_outcome, total_line_accuracy,
total_score, away_mileage, wax_wan) %>%
filter(spread_outcome == "Miss")
# Step 2: Count misses by moon fraction thresholds
counts <- spread_miss %>%
summarise(
less_than_05 = sum(moon_fraction < 0.5, na.rm = TRUE),
ge_05 = sum(moon_fraction >= 0.5, na.rm = TRUE),
less_than_025 = sum(moon_fraction < 0.25, na.rm = TRUE),
ge_075 = sum(moon_fraction >= 0.75, na.rm = TRUE)
)
# Display raw counts
print(counts)
## less_than_05 ge_05 less_than_025 ge_075
## 1 1021 1017 676 690
# Step 3: Chi-Square Test for <50% vs >=50%
chi_50 <- chisq.test(c(counts$less_than_05, counts$ge_05))
print(chi_50)
##
## Chi-squared test for given probabilities
##
## data: c(counts$less_than_05, counts$ge_05)
## X-squared = 0.0078508, df = 1, p-value = 0.9294
# Step 4: Chi-Square Test for <25% vs >=75%
chi_25_75 <- chisq.test(c(counts$less_than_025, counts$ge_075))
print(chi_25_75)
##
## Chi-squared test for given probabilities
##
## data: c(counts$less_than_025, counts$ge_075)
## X-squared = 0.14348, df = 1, p-value = 0.7048
# Extract p-values for reporting
p_50 <- round(chi_50$p.value, 4)
p_25_75 <- round(chi_25_75$p.value, 4)
cat("\nChi-square test (<50% vs >=50%): p =", p_50, "\n")
##
## Chi-square test (<50% vs >=50%): p = 0.9294
cat("Chi-square test (<25% vs >=75%): p =", p_25_75, "\n")
## Chi-square test (<25% vs >=75%): p = 0.7048
Commentary: moon_fraction does not seem to correlate with
spread_outcome. I filtered the data for games where the spread set by
Las Vegas was inaccurate (a “Miss”), then counted how many of those
games fell below or above a moon fraction of one half, and below one
quarter or above three quarters. The misses were split roughly equally
in both comparisons (1021 vs. 1017 and 676 vs. 690), regardless of
moon_fraction.
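As a check on the arithmetic, the chi-square statistic for the 50% split can be reproduced by hand from the printed counts. This is a minimal sketch in base R; the variable names (observed, expected, x_sq, p_value) are illustrative and not part of the original notebook. Note that this goodness-of-fit test assumes misses would split evenly between the two groups if the moon had no effect, which holds only if the games themselves are spread evenly across moon fractions.

```r
# Counts of spread misses printed by the notebook for the 50% split
observed <- c(less_than_05 = 1021, ge_05 = 1017)

# Under the null hypothesis each group is expected to hold half of the misses
expected <- rep(sum(observed) / length(observed), length(observed))

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
x_sq <- sum((observed - expected)^2 / expected)

# Upper-tail p-value with (number of groups - 1) degrees of freedom
p_value <- pchisq(x_sq, df = length(observed) - 1, lower.tail = FALSE)

print(c(x_squared = x_sq, p_value = p_value))
```

Swapping in the counts 676 and 690 gives the statistic for the quarter vs. three-quarter comparison in the same way.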
Does moon_fraction have any correlation with total_score?
score_effect <- meta_2010_2024 %>%
select(game_id, moon_fraction, spread_outcome, total_line_accuracy,
total_score, away_mileage, wax_wan) %>%
filter(total_score >= 54 | total_score < 36) # 36 is the 1st quartile, 45 the median, and 54 the 3rd quartile of total score per game in the dataset
ggplot(score_effect, aes(x=moon_fraction, y=total_score)) +
geom_point(size = 2) +
geom_smooth(method = "lm", se = TRUE) +
labs(
title = "Moon Fraction vs Total Score",
x = "Moon Fraction",
y = "Total Score"
) +
theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

ggplot(score_effect, aes(x = moon_fraction, y = total_score)) +
geom_bin2d(bins = 16, color = "white", alpha = 0.8) +
scale_fill_gradient(low = "yellow", high = "red") +
labs(
title = "Moon Fraction vs Total Score",
x = "Moon Fraction",
y = "Total Score",
fill = "Count"
)

Commentary: This chunk examines the relationship between moon
fraction and total game scores by filtering for extreme scores (above
the 3rd quartile or below the 1st quartile of total scores in the
dataset, where 36 is the 1st quartile, 45 median, and 54 the 3rd
quartile). It produces two plots: a scatter plot with a linear
regression trend line and confidence interval to visualize potential
correlations, and a 2D histogram (heat map) showing density of data
points at different moon fractions and total scores, using 16 bins per
axis with yellow-to-red color gradient. The plots show no visible
relationship between moon fraction and total score: the trend line is
flat and the points are scattered without pattern, consistent with the
overall investigation’s conclusion of no moon phase impact on NFL
outcomes.
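The cutoffs 36 and 54 are hard-coded in the filter. A hedged alternative is to compute them from the data with quantile(), so the filter follows the dataset if it changes. The sketch below uses simulated games; the games data frame and its values are made up for illustration and are not drawn from meta_2010_2024.

```r
# Simulated stand-in for the real dataset (assumption: values are invented)
set.seed(1)
games <- data.frame(
  moon_fraction = runif(500),
  total_score   = round(rnorm(500, mean = 45, sd = 14))
)

# First and third quartiles of total score, computed rather than typed in
cutoffs <- quantile(games$total_score, probs = c(0.25, 0.75), names = FALSE)

# Keep games below the 1st quartile or at/above the 3rd quartile,
# mirroring the notebook's `total_score >= 54 | total_score < 36`
score_extremes <- games[games$total_score < cutoffs[1] |
                        games$total_score >= cutoffs[2], ]

print(cutoffs)
print(nrow(score_extremes))
```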
Does moon_fraction correlate with total_line_accuracy?
total_line_miss <- meta_2010_2024 %>%
select(game_id, moon_fraction, spread_outcome, total_line_accuracy,
total_score, away_mileage, wax_wan) %>%
filter(total_line_accuracy >= 9 | total_line_accuracy <= -8) # -8 is 1st qu and 9 is 3rd qu. total line accuracy per game in dataset
ggplot(total_line_miss, aes(x=moon_fraction, y=total_line_accuracy)) +
geom_point(size = 2) +
geom_smooth(method = "lm", se = TRUE) +
labs(
title = "Moon Fraction vs Total Line Accuracy",
x = "Moon Fraction",
y = "Total Line Accuracy"
) +
theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

ggplot(total_line_miss, aes(x = moon_fraction, y = total_line_accuracy)) +
geom_bin2d(bins = 16, color = "white", alpha = 0.8) +
scale_fill_gradient(low = "yellow", high = "red") +
labs(
title = "Moon Fraction vs Total Line Accuracy",
x = "Moon Fraction",
y = "Total Line Accuracy",
fill = "Count"
)

Commentary: This chunk investigates moon fraction’s potential
correlation with Vegas total line accuracy by isolating extreme
inaccuracies (accuracy values above the 3rd quartile or below the 1st
quartile, where -8 is the 1st quartile and 9 the 3rd quartile). It
generates two visualizations: a scatter plot with points, linear smooth
trend line, and confidence intervals; and a 2D histogram heatmap with 16
bins, colored from yellow (low count) to red (high count). The results
show no discernible pattern or correlation between moon fraction and
total line accuracy, with data points evenly scattered and trend line
near horizontal, reinforcing the finding that moon phases do not
influence betting line predictions in NFL games.
Does multivariate filtering reveal moon_fraction correlations?
multi_var <- meta_2010_2024 %>%
select(game_id, moon_fraction, spread_outcome, total_line_accuracy,
total_score, away_mileage, wax_wan) %>%
filter(total_score >= 54 | total_score < 36) %>%
filter(total_line_accuracy >= 9 | total_line_accuracy <= -8) # 36 is 1st qu., 45 is med, and 54 is 3rd qu. score per game in dataset
ggplot(multi_var, aes(x=moon_fraction, y=total_line_accuracy)) +
geom_point(size = 2) +
geom_smooth(method = "lm", se = TRUE) +
labs(
title = "Moon Fraction vs Total Line AND Score",
x = "Moon Fraction",
y = "Total Line Accuracy"
) +
theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

ggplot(multi_var, aes(x = moon_fraction, y = total_line_accuracy)) +
geom_bin2d(bins = 16, color = "white", alpha = 0.8) +
scale_fill_gradient(low = "yellow", high = "red") +
labs(
title = "Moon Fraction vs Total Line AND Score",
x = "Moon Fraction",
y = "Total Line Accuracy",
fill = "Count"
)

Commentary: This multivariate analysis builds on the previous sections
by applying both the extreme total score filter (from the total_score
section) and the extreme total line accuracy filter (from the
total_line_accuracy section) at the same time. It then plots the
filtered data: first, a scatter plot of moon fraction vs. total line
accuracy with linear trend and confidence band; second, a 2D histogram
heatmap with 16 bins. Even with both filters combined, no meaningful
relationship emerges between moon fraction and total line accuracy among
these extreme-score games: the plots display scattered points without
clear patterns or sloped trends.
Does statistical testing confirm a lack of moon_fraction correlations?
# Correlation test: moon_fraction vs total_score
cor_test_score <- cor.test(meta_2010_2024$moon_fraction, meta_2010_2024$total_score)
print(cor_test_score) # Outputs r and p-value
##
## Pearson's product-moment correlation
##
## data: meta_2010_2024$moon_fraction and meta_2010_2024$total_score
## t = -0.25562, df = 4076, p-value = 0.7983
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03469309 0.02669313
## sample estimates:
## cor
## -0.004003747
# Correlation test: moon_fraction vs total_line_accuracy
cor_test_accuracy <- cor.test(meta_2010_2024$moon_fraction, meta_2010_2024$total_line_accuracy)
print(cor_test_accuracy)
##
## Pearson's product-moment correlation
##
## data: meta_2010_2024$moon_fraction and meta_2010_2024$total_line_accuracy
## t = 0.55369, df = 4076, p-value = 0.5798
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.02202720 0.03935539
## sample estimates:
## cor
## 0.008672263
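# Chi-squared for spread_outcome vs binned moon_fraction
# Note: cut() builds the intervals (0, 0.25], ..., (0.75, 1], so games with moon_fraction exactly 0
# become NA and are dropped by table(); include.lowest = TRUE would keep them.
# The labels name fraction quartiles only; they do not use wax_wan.
meta_2010_2024 <- meta_2010_2024 %>%
mutate(moon_bin = cut(moon_fraction, breaks = c(0, 0.25, 0.5, 0.75, 1),
labels = c("New", "Waxing", "Waning", "Full")))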
chi_test <- chisq.test(table(meta_2010_2024$spread_outcome, meta_2010_2024$moon_bin))
print(chi_test) # p-value to check independence
##
## Pearson's Chi-squared test
##
## data: table(meta_2010_2024$spread_outcome, meta_2010_2024$moon_bin)
## X-squared = 0.76447, df = 6, p-value = 0.993
Commentary: These tests further confirm that moon phase does not
correlate with total score, total line accuracy, or spread outcome.
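The t statistics in the cor.test() output follow directly from the correlation estimate and the degrees of freedom, via t = r * sqrt(df) / sqrt(1 - r^2). The sketch below recomputes them from the printed values as a sanity check; the helper name r_to_t is hypothetical and not part of the notebook.

```r
# Convert a Pearson correlation into its t statistic (hypothetical helper)
r_to_t <- function(r, df) r * sqrt(df) / sqrt(1 - r^2)

df <- 4076  # degrees of freedom printed by cor.test(), i.e. n - 2

# Correlation estimates copied from the printed output
t_score    <- r_to_t(-0.004003747, df)  # moon_fraction vs total_score
t_accuracy <- r_to_t(0.008672263, df)   # moon_fraction vs total_line_accuracy

# Two-sided p-values from the t distribution
p_score    <- 2 * pt(-abs(t_score), df)
p_accuracy <- 2 * pt(-abs(t_accuracy), df)

print(round(c(t_score = t_score, p_score = p_score,
              t_accuracy = t_accuracy, p_accuracy = p_accuracy), 4))
```

With correlations this close to zero and roughly 4,000 games, the confidence intervals are narrow, so the data rule out anything but a very small linear relationship.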
Do linear regressions show moon_fraction effects?
# Simple linear model for total_score
lm_score <- lm(total_score ~ moon_fraction + away_mileage + wax_wan, data = meta_2010_2024)
summary(lm_score) # Check coefficients, R-squared, p-values
##
## Call:
## lm(formula = total_score ~ moon_fraction + away_mileage + wax_wan,
## data = meta_2010_2024)
##
## Residuals:
## Min 1Q Median 3Q Max
## -42.169 -9.425 -0.894 8.655 58.981
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 45.6107114 1.2885946 35.396 <2e-16 ***
## moon_fraction 0.0018741 0.6786827 0.003 0.998
## away_mileage 0.0002049 0.0001507 1.360 0.174
## wax_wannew 0.9669131 1.6450238 0.588 0.557
## wax_wanwane -0.9750857 1.1549307 -0.844 0.399
## wax_wanwax -0.1516362 1.1540501 -0.131 0.895
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.9 on 4072 degrees of freedom
## Multiple R-squared: 0.0018, Adjusted R-squared: 0.0005742
## F-statistic: 1.468 on 5 and 4072 DF, p-value: 0.1967
# For total_line_accuracy, filtered to extremes
lm_accuracy <- lm(total_line_accuracy ~ moon_fraction + away_mileage + wax_wan,
data = filter(meta_2010_2024, abs(total_line_accuracy) >= 8))
summary(lm_accuracy)
##
## Call:
## lm(formula = total_line_accuracy ~ moon_fraction + away_mileage +
## wax_wan, data = filter(meta_2010_2024, abs(total_line_accuracy) >=
## 8))
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.422 -13.672 7.964 14.422 39.608
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.141e+00 2.169e+00 -0.987 0.324
## moon_fraction 1.204e-01 1.144e+00 0.105 0.916
## away_mileage -9.792e-05 2.559e-04 -0.383 0.702
## wax_wannew -2.028e+00 2.819e+00 -0.719 0.472
## wax_wanwane 2.210e+00 1.941e+00 1.139 0.255
## wax_wanwax 7.917e-01 1.947e+00 0.407 0.684
##
## Residual standard error: 17.4 on 2230 degrees of freedom
## Multiple R-squared: 0.003438, Adjusted R-squared: 0.001204
## F-statistic: 1.539 on 5 and 2230 DF, p-value: 0.1745
# Visualize residuals to check model fit
plot(lm_score, which = 1) # Residuals vs fitted

Does wax/wane moon phase affect outcomes?
# Faceted scatter plot
ggplot(meta_2010_2024, aes(x = moon_fraction, y = total_score, color = wax_wan)) +
geom_point() +
geom_smooth(method = "lm") +
facet_wrap(~ wax_wan) +
labs(title = "Total Score vs Moon Fraction by Wax/Wane") +
theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

# Boxplot of total_line_accuracy by moon_bin and wax_wan
ggplot(meta_2010_2024, aes(x = moon_bin, y = total_line_accuracy, fill = wax_wan)) +
geom_boxplot() +
labs(title = "Total Line Accuracy by Moon Phase Bin and Wax/Wane")

Commentary: This chunk explores the wax/wane phases of the moon as
categorical variables (derived from moon_fraction) in relation to game
outcomes. The first plot is a faceted scatter plot of total score
vs. moon fraction, colored and smooth-lined per wax/wane phase, showing
no intersecting trends. The second is a boxplot of total line accuracy
grouped by binned moon phase (New, Waxing, Waning, Full) and filled by
wax/wane status, revealing symmetric distributions across categories.
Overall, no significant associations are detected between wax/wane moon
phases and total scores or line accuracies, with boxplots showing
similar medians and spread, consistent with the absence of lunar
influences on NFL results.
Is moon_fraction associated with overtime games?
# Filter for overtime games and summarize by moon_fraction
overtime_games <- meta_2010_2024 %>% filter(overtime == 1)
summary(overtime_games$moon_fraction) # Compare mean/median to non-overtime
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.110 0.450 0.465 0.790 1.000
# T-test: Is moon_fraction different in overtime vs non-overtime?
t.test(moon_fraction ~ overtime, data = meta_2010_2024)
##
## Welch Two Sample t-test
##
## data: moon_fraction by overtime
## t = 1.8181, df = 277.07, p-value = 0.07014
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.003447302 0.086731009
## sample estimates:
## mean in group 0 mean in group 1
## 0.5066214 0.4649796
# Plot total_score by week, filled by binned moon_fraction (discrete)
ggplot(meta_2010_2024, aes(x = factor(week), y = total_score, fill = moon_bin)) +
geom_boxplot() + labs(title = "Total Score by Week, Filled by Binned Moon Fraction")

Commentary: Focusing on overtime games, this code first filters the
dataset for overtime occurrences (overtime == 1), summarizes moon
fraction for those games, and compares them to non-overtime games via a
Welch t-test. The mean moon fraction is 0.465 in overtime games versus
0.507 otherwise. The difference is not significant at the 0.05 level
(p = 0.070), and the 95% confidence interval (-0.003 to 0.087) includes
zero, although only narrowly. The second plot is a boxplot of total
scores by game week, filled by binned moon phase, and it indicates no
systematic variation in scores across moon phase bins. This supports the
broader finding of no lunar correlations, as overtime occurrences appear
unrelated to moon phases.
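The 95% confidence interval in the Welch output can be rebuilt from the other printed values, which shows why a p-value of about 0.07 goes with an interval that only just includes zero. A sketch using only the numbers shown in the output (variable names are illustrative):

```r
# Values copied from the printed Welch t-test
mean_no_ot <- 0.5066214   # mean moon_fraction, non-overtime games (group 0)
mean_ot    <- 0.4649796   # mean moon_fraction, overtime games (group 1)
t_stat     <- 1.8181
welch_df   <- 277.07

# Difference in means and the standard error implied by t = diff / se
diff_means <- mean_no_ot - mean_ot
std_error  <- diff_means / t_stat

# 95% interval: difference +/- t-quantile * standard error
margin <- qt(0.975, df = welch_df) * std_error
ci <- c(lower = diff_means - margin, upper = diff_means + margin)

# Two-sided p-value for the same statistic
p_value <- 2 * pt(-abs(t_stat), df = welch_df)

print(round(ci, 4))
print(round(p_value, 4))
```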
Do k-means clusters reveal patterns involving moon_fraction?
cluster_data <- meta_2010_2024 %>% select(moon_fraction, total_score, total_line_accuracy) %>% na.omit()
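set.seed(2024) # k-means starts from random centers; fixing the seed makes the cluster labels reproducible
kmeans_result <- kmeans(cluster_data, centers = 3)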
cluster_data$cluster <- factor(kmeans_result$cluster)
ggplot(cluster_data, aes(x = moon_fraction, y = total_score, color = cluster)) + geom_point()

Commentary: This applies k-means clustering (with 3 centers) to a
subset of numeric variables (moon_fraction, total_score,
total_line_accuracy) after removing NA values, assigning each data point
to a cluster. The resulting scatter plot colors points by cluster,
visualizing potential groupings in the relationship between moon
fraction and total score. However, the clusters do not align with moon
fraction in a way that suggests meaningful separations related to game
outcomes; data points are mixed across clusters without lunar-driven
patterns, further confirming the lack of correlations observed
throughout the investigation.
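One caveat with the clustering above: kmeans() works on raw Euclidean distance, so moon_fraction (which ranges from 0 to 1) contributes far less to the distances than total_score or total_line_accuracy, which span tens of points. A hedged variant standardizes the columns first. The data below are simulated stand-ins, not the notebook's dataset, and the seed and nstart = 25 are assumptions chosen for illustration.

```r
# Simulated stand-in for the three clustering variables (invented values)
set.seed(123)
sim <- data.frame(
  moon_fraction       = runif(300),
  total_score         = rnorm(300, mean = 45, sd = 14),
  total_line_accuracy = rnorm(300, mean = 0, sd = 13)
)

# Standardize each column to mean 0 and sd 1 so no variable dominates the distance
scaled <- scale(sim)

# Run k-means on the scaled data; nstart repeats the random initialization
km <- kmeans(scaled, centers = 3, nstart = 25)
sim$cluster <- factor(km$cluster)

# Cluster sizes and per-cluster means on the original scale
print(table(sim$cluster))
print(aggregate(cbind(moon_fraction, total_score) ~ cluster, data = sim, FUN = mean))
```

If moon fraction carried real structure, the per-cluster means of moon_fraction would separate after scaling; on independent simulated data like this, any separation is an artifact of the algorithm partitioning noise.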
Do normalized plots show moon_fraction effects?
# First, compute density of moon_fraction (for normalization)
moon_density <- meta_2010_2024 %>%
mutate(moon_bin = cut(moon_fraction, breaks = seq(0, 1, by = 0.0625), include.lowest = TRUE, right = TRUE)) %>%
count(moon_bin) %>%
rename(total_count = n)
# For score_effect: Join and normalize
normalized_score <- score_effect %>%
mutate(moon_bin = cut(moon_fraction, breaks = seq(0, 1, by = 0.0625), include.lowest = TRUE, right = TRUE)) %>%
count(moon_bin, total_score) %>%
left_join(moon_density, by = "moon_bin") %>%
mutate(relative_density = n / total_count)
# Plot normalized 2D histogram (use relative_density for fill)
ggplot(normalized_score, aes(x = moon_bin, y = total_score, fill = relative_density)) +
geom_tile() +
scale_fill_gradient(low = "yellow", high = "red") +
labs(title = "Normalized: Moon Fraction vs Total Score",
x = "Moon Fraction Bin", y = "Total Score", fill = "Relative Density") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Plot moon_fraction by season to see yearly variation
ggplot(meta_2010_2024, aes(x = factor(season), y = moon_fraction)) +
geom_boxplot() +
labs(title = "Moon Fraction Distribution by Season", x = "Season", y = "Moon Fraction") +
theme_minimal()

Commentary: To address potential biases in distribution, this
normalizes the earlier score extreme plot by moon fraction density. It
creates moon bins of 0.0625 width (matching the 16 bins), counts
occurrences per bin and total score, then divides by total counts to get
relative density. The tile plot visualizes this normalized density, and
a boxplot shows moon fraction variation by season. While revealing some
seasonal patterns in moon phase distribution (possibly due to calendar
shifts), the normalized plot still shows no concentrated relationships
between moon fraction and total scores, as densities are relatively
uniform, ruling out lunar effects that were initially suspected.
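The normalization idea can be sketched on simulated data: count extreme-score games per moon bin and divide by all games in that bin, so bins that simply contain more games do not look hotter. The sketch below collapses the score axis and reports one share per bin, which is a simplification of the notebook's per-score densities; the games data frame is made up for illustration.

```r
# Simulated stand-in for the real dataset (invented values)
set.seed(42)
games <- data.frame(
  moon_fraction = runif(1000),
  total_score   = round(rnorm(1000, mean = 45, sd = 14))
)

# Sixteen moon bins of width 0.0625, closed on the left edge of the first bin
breaks <- seq(0, 1, by = 0.0625)
games$moon_bin <- cut(games$moon_fraction, breaks = breaks, include.lowest = TRUE)

# Denominator: all games per bin
total_per_bin <- table(games$moon_bin)

# Numerator: games with extreme scores per bin (same thresholds as the notebook)
is_extreme <- games$total_score >= 54 | games$total_score < 36
extreme_per_bin <- table(games$moon_bin[is_extreme])

# Share of each bin's games that are extreme; factor levels keep the bins aligned
share_extreme <- as.numeric(extreme_per_bin) / as.numeric(total_per_bin)
names(share_extreme) <- levels(games$moon_bin)

print(round(share_extreme, 2))
```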
Does season start alignment affect moon_fraction distribution?
# Take the earliest start_date_time in each season as that season's first game
meta_by_season <- meta_2010_2024 %>%
group_by(season) %>%
summarise(start_date = min(start_date_time, na.rm = TRUE))
meta_by_season$moon_at_start <- meta_2010_2024$moon_fraction[match(meta_by_season$start_date, meta_2010_2024$start_date_time)]
ggplot(meta_by_season, aes(x = factor(season), y = moon_at_start)) +
geom_boxplot() +
labs(title = "Moon Fraction at Season Start", x = "Season", y = "Moon Fraction")

Commentary: This identifies the earliest game date per season and
matches the corresponding moon fraction at season start, then plots
these values by season. Because each season contributes a single value,
each box collapses to one line, so the plot shows only how the starting
moon fraction varies from year to year. That variation follows no
consistent trend. This suggests that the alignment of NFL schedules with
moon phases does not introduce a systematic bias into the correlations
above, in keeping with the conclusion that moon phases are unrelated to
NFL performance metrics.
Investigating Moon Phase and Home/Away Advantage
# Define moon phase bins using moon_fraction and wax_wan
meta_2010_2024 <- meta_2010_2024 %>%
mutate(moon_phase = case_when(
moon_fraction < 0.05 ~ "New Moon",
moon_fraction > 0.95 ~ "Full Moon",
wax_wan == "wax" & moon_fraction < 0.5 ~ "Waxing Crescent",
wax_wan == "wax" & moon_fraction >= 0.5 ~ "Waxing Gibbous",
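# NOTE: the regression output above lists this wax_wan level as "wane", so the "wan" test below appears never to match
wax_wan == "wan" & moon_fraction > 0.5 ~ "Waning Gibbous",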
TRUE ~ "Waning Crescent"
),
moon_phase = factor(moon_phase, levels = c("New Moon", "Waxing Crescent", "Waxing Gibbous", "Full Moon", "Waning Gibbous", "Waning Crescent")),
home_win = ifelse(result > 0, 1, ifelse(result < 0, 0, NA))
)
overall_home_win_rate <- meta_2010_2024 %>%
filter(!is.na(home_win)) %>%
summarize(home_win_rate = mean(home_win)) %>%
pull(home_win_rate)
summary_stats <- meta_2010_2024 %>%
filter(!is.na(home_win)) %>%
group_by(moon_phase) %>%
summarize(
home_win_rate = mean(home_win),
avg_score_diff = mean(result),
n_games = n(),
se = sqrt(home_win_rate * (1 - home_win_rate) / n_games),
.groups = 'drop'
) %>%
mutate(
lower = home_win_rate - 1.96 * se,
upper = home_win_rate + 1.96 * se
) %>%
add_row(
moon_phase = "Overall",
home_win_rate = overall_home_win_rate,
avg_score_diff = mean(meta_2010_2024$result, na.rm = TRUE),
n_games = sum(!is.na(meta_2010_2024$home_win)),
se = NA, lower = NA, upper = NA
)
print(summary_stats)
## # A tibble: 6 × 7
## moon_phase home_win_rate avg_score_diff n_games se lower upper
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 New Moon 0.563 2.37 520 0.0217 0.521 0.606
## 2 Waxing Crescent 0.589 2.39 721 0.0183 0.554 0.625
## 3 Waxing Gibbous 0.534 1.75 771 0.0180 0.499 0.570
## 4 Full Moon 0.558 1.93 550 0.0212 0.517 0.600
## 5 Waning Crescent 0.555 2.20 1504 0.0128 0.530 0.580
## 6 Overall 0.559 2.13 4066 NA NA NA
ggplot(summary_stats %>% filter(moon_phase != "Overall"), aes(x = moon_phase, y = home_win_rate)) +
geom_col(fill = "steelblue", alpha = 0.8) +
geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2, color = "black") +
geom_hline(yintercept = overall_home_win_rate, linetype = "dashed", color = "red", linewidth = 1) + # linewidth replaces size for lines (ggplot2 >= 3.4.0)
geom_text(aes(label = sprintf("%.1f%%", home_win_rate * 100)), vjust = -0.5, color = "white", size = 3.5, fontface = "bold") +
annotate("text", x = 1, y = overall_home_win_rate + 0.015, label = "Overall: 55.9%", color = "red", hjust = 0, size = 4) +
labs(title = "Home Win Rate by Moon Phase", x = "Moon Phase", y = "Home Win Rate") +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(size = 14, face = "bold")
)

ggplot(meta_2010_2024, aes(x = moon_phase, y = result)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Home Score Differential by Moon Phase", x = "Moon Phase", y = "Home Score Differential (Result)") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))

logit_model <- glm(home_win ~ moon_phase, data = meta_2010_2024 %>% filter(!is.na(home_win)), family = binomial)
summary(logit_model)
##
## Call:
## glm(formula = home_win ~ moon_phase, family = binomial, data = meta_2010_2024 %>%
## filter(!is.na(home_win)))
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.25522 0.08842 2.886 0.0039 **
## moon_phaseWaxing Crescent 0.10651 0.11640 0.915 0.3602
## moon_phaseWaxing Gibbous -0.11752 0.11415 -1.030 0.3032
## moon_phaseFull Moon -0.02144 0.12325 -0.174 0.8619
## moon_phaseWaning Crescent -0.03357 0.10252 -0.327 0.7433
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 5580.3 on 4065 degrees of freedom
## Residual deviance: 5575.6 on 4061 degrees of freedom
## AIC: 5585.6
##
## Number of Fisher Scoring iterations: 4
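The coefficients above are on the log-odds scale, with New Moon as the reference level. Converting them back to probabilities with plogis() gives a home win rate per phase, which can be compared with the home_win_rate column of the summary table (0.563, 0.589, 0.534, 0.558, 0.555). A sketch using the printed estimates; the variable names are illustrative:

```r
# Intercept = log-odds of a home win in the reference phase (New Moon)
intercept <- 0.25522

# Offsets from the reference level, copied from the printed coefficients
offsets <- c(
  "New Moon"        =  0,
  "Waxing Crescent" =  0.10651,
  "Waxing Gibbous"  = -0.11752,
  "Full Moon"       = -0.02144,
  "Waning Crescent" = -0.03357
)

# Inverse logit: probability = 1 / (1 + exp(-log_odds))
home_win_prob <- plogis(intercept + offsets)

print(round(home_win_prob, 3))
```

Because the model contains only the phase factor, its fitted probabilities are just the observed per-phase win rates, so the regression adds significance tests rather than new estimates.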
Commentary: This analysis categorizes games into moon phase bins
based on moon_fraction and wax_wan status, then computes home win rates
and average score differentials (result, home minus away) for each
phase, alongside the overall home win rate across all games. Note that
the summary table has no Waning Gibbous row: the earlier regression
output lists the wax_wan level as "wane", so the "wan" condition appears
never to match and those games fall into Waning Crescent (n = 1504). The
summary statistics and bar plot show that home win rates vary slightly
by phase, from 53.4% (Waxing Gibbous) to 58.9% (Waxing Crescent), with
no clear pattern or substantial deviation from the overall rate of
55.9%. The dashed red line in the bar plot marks the overall rate, and
every phase's 95% interval contains it, so the variations are consistent
with random chance rather than a systematic effect. The boxplot of score
differentials reinforces this, with similar distributions across phases
and mean home advantages of roughly 2 points (1.75 to 2.39). The
logistic regression model predicting home win probability from moon
phase yields non-significant coefficients (p > 0.30 for all phases
relative to New Moon), so there is no statistically detectable influence
of moon phase on home/away outcomes. This aligns with previous findings
in the notebook, suggesting that moon phases do not affect NFL
performance metrics and that any perceived relationships are
coincidental.
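The summary table above has five phases rather than six, and the regression output earlier lists the wax_wan levels as new, wane, and wax, so the `wax_wan == "wan"` test in the case_when() appears never to match. Below is a hedged sketch of a classification that uses "wane" and leaves unmatched rows as NA instead of defaulting them to Waning Crescent. The function name classify_moon_phase is hypothetical, the level names are inferred from the regression output, and the example inputs are invented.

```r
# Classify games into six moon phases (hypothetical helper; assumes wax_wan
# takes the values "wax" and "wane" for waxing and waning moons)
classify_moon_phase <- function(moon_fraction, wax_wan) {
  phase <- rep(NA_character_, length(moon_fraction))

  # which() drops NA comparisons, so rows with missing wax_wan stay NA
  phase[which(wax_wan == "wax"  & moon_fraction <  0.5)] <- "Waxing Crescent"
  phase[which(wax_wan == "wax"  & moon_fraction >= 0.5)] <- "Waxing Gibbous"
  phase[which(wax_wan == "wane" & moon_fraction >= 0.5)] <- "Waning Gibbous"
  phase[which(wax_wan == "wane" & moon_fraction <  0.5)] <- "Waning Crescent"

  # New and full moons take priority, as in the notebook's case_when()
  phase[which(moon_fraction < 0.05)] <- "New Moon"
  phase[which(moon_fraction > 0.95)] <- "Full Moon"

  factor(phase, levels = c("New Moon", "Waxing Crescent", "Waxing Gibbous",
                           "Full Moon", "Waning Gibbous", "Waning Crescent"))
}

# Invented example inputs covering each phase once
example_phase <- classify_moon_phase(
  moon_fraction = c(0.01, 0.30, 0.70, 0.99, 0.70, 0.30),
  wax_wan       = c("new", "wax", "wax", "full", "wane", "wane")
)
print(table(example_phase, useNA = "ifany"))
```

Leaving unmatched rows as NA is a deliberate choice here: a catch-all branch hides typos in level names, while NA counts make them visible in the summary table.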