I’ll get straight to it. I love Strictly. You may know it by “Dancing with the Stars”, “Bailando por un Sueño”, or “Danse avec les stars”. I resisted for a decade, but for the last two years I’ve been fully seduced by the big glitter ball and haven’t missed an episode.
Now as a fully qualified arm-chair critic, I think I’ve noticed that to get a good score on Strictly you just need to do the Charleston. It hides a multitude of sins. You sort of jump about a bit, do something called “swivel”, and fall over at the end. Then you get a minimum 35. Job done.
My family disagree, so I’ve set about to test my hypothesis. The great news is that this is really easy to do with R. In fact, this whole post uses no more R than you’d find in an introduction to R course. My family can’t wait to find out the results. Wherever they went.
So let’s get started and finally put this to rest. Base R has everything we need but we’ll get there quicker with the tidyverse. As most of my notebooks now start, let’s load the main tidyverse packages.
library(tidyverse) # I've set message=FALSE everywhere
The Data
We’re lucky that someone at www.ultimatestrictly.com has been keeping track of every dance that’s ever happened on strictly. And we’re even luckier that they have made it available as a CSV download. It only goes up to 2016 but that’s still 14 series so it should be enough.
With the URL all we need are a couple of tweaks to the default read_csv
to account properly for an unusual NA string and some sparse columns that guess the data type incorrectly.
url <- "https://www.ultimatestrictly.com/s/SCD-Results-S14.csv"
raw_results <- read_csv(file = url, na = c("-"), guess_max = 10000)
head(raw_results) %>%
knitr::kable() # pretty markdown tables
Couple | Dance | Song | Series | Week | Order | Craig | Arlene | Len | Bruno | Alesha | Darcey | Jennifer | Donny | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Natasha & Brendan | Cha cha cha | Chain Of Fools | 1 | 1 | 1 | 5 | 7 | 8 | 7 | NA | NA | NA | NA | 27 |
Lesley & Anton | Waltz | He Was Beautiful | 1 | 1 | 2 | 6 | 8 | 8 | 7 | NA | NA | NA | NA | 29 |
Chris & Hanna | Cha cha cha | Lady Marmalade | 1 | 1 | 3 | 4 | 4 | 7 | 4 | NA | NA | NA | NA | 19 |
Jason & Kylie | Waltz | Three Times A Lady | 1 | 1 | 4 | 5 | 5 | 6 | 5 | NA | NA | NA | NA | 21 |
Verona & Paul | Cha cha cha | R.E.S.P.E.C.T | 1 | 1 | 5 | 7 | 6 | 7 | 7 | NA | NA | NA | NA | 27 |
Claire & John | Waltz | Unchained Melody | 1 | 1 | 6 | 7 | 7 | 8 | 5 | NA | NA | NA | NA | 27 |
We get every dance, for every week in every series, and individual judges scores as well as the total score.
Let’s start with a little light cleaning. The formatting on the dances is a bit inconsistent. For example we have Cha cha cha
and Cha Cha Cha
. Most of it can be fixed by forcing the dance names to be a consistent case. I like the look of title case so we’ll use stringr’s str_to_title
.
dances <- raw_results %>%
mutate(Dance = str_to_title(Dance))
dances %>%
select(Dance) %>%
head() %>%
knitr::kable()
Dance |
---|
Cha Cha Cha |
Waltz |
Cha Cha Cha |
Waltz |
Cha Cha Cha |
Waltz |
My next step would usually be to gather this into a tidy long data frame, maybe separate the couples into celebrity and professional. But in this instance we only want the Dance and the Total, so we’ll leave it like it is.
Top Dances
I’d like to know which dances get the best Total
score. We can do this with a group_by
, summarise
combination.
dances %>%
group_by(Dance) %>%
summarise(MeanScore = mean(Total, na.rm = TRUE),
Count = n()) %>%
head(3)
## # A tibble: 3 x 3
## Dance MeanScore Count
## <chr> <dbl> <int>
## 1 American Smooth 31.6 102
## 2 Argentine Tango 35.7 43
## 3 Cha Cha 32 1
I’m removing anything with less than 10 scores because these tend to be hybrid dances and one-offs.
filter(Count >= 10) %>%
The Show Dance only appears in the final episode and is multi-style. Out it goes.
filter(Dance != "Show Dance") %>%
I don’t want any “hops” in there. The hops are those ones where everyone’s on the floor at the same time and you have no idea what’s going on but they all seem to be enjoying themselves. There’s only a Lindy Hop in this data but I don’t want any hops. Ever. No Hops!
filter(!str_detect(Dance, " Hop")) %>%
Putting it all together we get our top dances.
top_dances <- dances %>%
group_by(Dance) %>%
summarise(MeanScore = mean(Total, na.rm = TRUE),
Count = n()) %>%
filter(Count >= 10) %>%
filter(Dance != "Show Dance") %>%
filter(!str_detect(Dance, " Hop")) %>%
arrange(desc(MeanScore))
knitr::kable(top_dances)
Dance | MeanScore | Count |
---|---|---|
Argentine Tango | 35.67442 | 43 |
Viennese Waltz | 32.10000 | 90 |
Charleston | 31.74324 | 74 |
American Smooth | 31.57843 | 102 |
Quickstep | 30.71538 | 130 |
Foxtrot | 30.05738 | 122 |
Samba | 29.46296 | 108 |
Paso Doble | 29.21552 | 116 |
Tango | 28.89552 | 134 |
Salsa | 28.72549 | 102 |
Jive | 28.26016 | 123 |
Rumba | 28.17273 | 110 |
Waltz | 27.93233 | 133 |
Cha Cha Cha | 25.44805 | 154 |
And there we have it. I’m sort of right. The Charleston is top 3. And who doesn’t like an Argentine Tango? But there’s something about that count… What if they tend to only do the Charleston in later weeks when they’re a bit better?
We know that the scores go up by week. In fact it’s pretty linear.
dances %>%
ggplot(aes(x = Week, y = Total, group = Week)) +
geom_boxplot()
Let’s do the same again and add average week. This will give the mean week in which each dance appears. A bigger number will mean they tend to appear later.
top_dances_week <- dances %>%
group_by(Dance) %>%
summarise(MeanScore = mean(Total, na.rm = TRUE),
MeanWeek = mean(Week, na.rm = TRUE),
Count = n()) %>%
filter(Count >= 10) %>%
filter(Dance != "Show Dance") %>%
filter(!str_detect(Dance, " Hop")) %>%
arrange(desc(MeanScore))
knitr::kable(head(top_dances_week, 5))
Dance | MeanScore | MeanWeek | Count |
---|---|---|---|
Argentine Tango | 35.67442 | 10.023256 | 43 |
Viennese Waltz | 32.10000 | 6.911111 | 90 |
Charleston | 31.74324 | 6.891892 | 74 |
American Smooth | 31.57843 | 7.333333 | 102 |
Quickstep | 30.71538 | 5.823077 | 130 |
Rats. My hypothesis is in trouble. The Charleston does indeed have a high MeanWeek
which means it appears later in the series when the scores are higher. My children have long since gone to bed so let’s take it up a notch and build a statistical model that can account for Week
and Dance
at the same time.
Linear Model
First, we’ll take the individual dance level data and reuse the filtering for the top_dances
.
main_dances <- dances %>%
filter(Dance %in% top_dances$Dance)
We want a simple linear model that accounts for the increasing score by week and the effect we’re interested in – the dance.
fit <- lm(Total ~ Week + Dance, data = main_dances)
summary(fit)
##
## Call:
## lm(formula = Total ~ Week + Dance, data = main_dances)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.9515 -3.4593 0.4912 3.5637 19.4965
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.37653 0.59660 39.183 < 2e-16 ***
## Week 1.11844 0.04051 27.607 < 2e-16 ***
## DanceArgentine Tango 1.08747 0.95628 1.137 0.25564
## DanceCha Cha Cha -1.87934 0.68459 -2.745 0.00612 **
## DanceCharleston 0.65854 0.79808 0.825 0.40941
## DanceFoxtrot 0.16272 0.70368 0.231 0.81716
## DanceJive -1.25416 0.70372 -1.782 0.07492 .
## DancePaso Doble -1.22840 0.71043 -1.729 0.08399 .
## DanceQuickstep 0.82609 0.69385 1.191 0.23400
## DanceRumba -1.78227 0.72064 -2.473 0.01350 *
## DanceSalsa -1.36169 0.73365 -1.856 0.06364 .
## DanceSamba -1.83586 0.72150 -2.545 0.01104 *
## DanceTango -0.13999 0.69274 -0.202 0.83988
## DanceViennese Waltz 0.99380 0.75585 1.315 0.18877
## DanceWaltz -0.03570 0.70003 -0.051 0.95934
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.225 on 1526 degrees of freedom
## Multiple R-squared: 0.3969, Adjusted R-squared: 0.3914
## F-statistic: 71.75 on 14 and 1526 DF, p-value: < 2.2e-16
After removing the strong effect of week, there’s not much left of significance. It can be tempting to cherry pick individual levels but the only one with much signal is the Cha Cha Cha. Ultimate Strictly describes it as “A cheeky, fun dance, but rarely a show stopper”.
Conclusion
Unfortunately it looks like the evidence doesn’t support my hypothesis that the Charleston is more favourably marked. Maybe all that swivelling is harder than it looks. Faye and Giovanni did undeniably do an excellent job with their, perfect scoring, Sound of Music Charleston. What we did find is that the Cha Cha Cha might score lower. Although that’s most likely because it’s seen as an easier dance for, well, People like Quentin.
I’m sorry Quentin. I would fare no better.
Well seeing as I’m wrong, and everyone else is asleep. Let’s just pretend this never happened.