Kororareka’s stats for the Times Crossword 2008

Here are the results of Kororareka’s work on the extent of repetition in Times crossword answers. Apart from some editing for style and length, the information below is his work, not mine. I’ve added a few comments of my own in italics.

I tallied the daily and Saturday Times crosswords for the calendar year 2008, a total of 313 puzzles. There were 9187 clues altogether – an average of 29.35 per puzzle [this is pretty much as you’d expect from the set of grids – see April 2008 Grids article]. 8085 distinct entries (words or phrases) were used as solutions. (Words like ABJURED and ABJURING were counted as two distinct words, since counting roots seemed too difficult without a fair amount of human intervention.) The number of times each entry appeared as a solution is shown in the table below. An overwhelming majority (some 88%) of the entries appeared only once, which is probably testament to the setters’ skills. That said, 946 entries appeared more than once, with 131 appearing more than twice.

Times used  Entries  Relative freq.  Cumulative entries  Cumulative relative freq.
1           7139     0.8830          7139                0.8830
2           815      0.1008          7954                0.9838
3           110      0.0136          8064                0.9974
4           17       0.0021          8081                0.9995
5           4        0.0005          8085                1.0000
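
For anyone wanting to try a tally of this kind, here is a minimal Python sketch. It is not Kororareka's actual code (the post doesn't say what he used); the function name `frequency_table` and the toy answer list are made up for illustration. The last few lines simply check that the published counts agree with the totals quoted above (8085 entries, 9187 clues, 29.35 per puzzle).

```python
from collections import Counter

def frequency_table(answers):
    """Return rows (times_used, entries, rel_freq, cum_entries, cum_rel_freq).

    `answers` is a flat list of solution strings, one per clue.
    Hypothetical helper written for this illustration only.
    """
    per_entry = Counter(answers)            # entry -> number of appearances
    by_count = Counter(per_entry.values())  # appearances -> number of entries
    total = len(per_entry)                  # number of distinct entries
    rows, running = [], 0
    for times_used in sorted(by_count):
        n = by_count[times_used]
        running += n
        rows.append((times_used, n, n / total, running, running / total))
    return rows

# Toy data, invented for illustration (not the 2008 answers).
toy = ["TEA", "ROD", "TEA", "APSE", "ERATO", "TEA", "ROD"]
for row in frequency_table(toy):
    print(row)

# Consistency check on the counts published in the table.
published = {1: 7139, 2: 815, 3: 110, 4: 17, 5: 4}
distinct_entries = sum(published.values())
total_clues = sum(k * n for k, n in published.items())
per_puzzle = round(total_clues / 313, 2)
print(distinct_entries, total_clues, per_puzzle)
```

Inflected forms are treated as distinct strings here, which matches the ABJURED/ABJURING convention described above.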

Those which appeared 5 times were: ROD, TEA, TIE, and TOPIC. TEA was also included in many compound words – TEA GARDEN, TEA LEAF, etc. It is a fitting accolade to a national obsession.

The words which appeared 4 times were: ADO, APPLE, APSE, ARCHIVIST, ARENA, ASCOT, ASSET, AWARD, ERATO, GENTRY, INFERNO, LEGIT, OMANI, PEA, RELAPSE, RUT, SEA. There are some words here that I wouldn’t have suspected, but others such as ERATO, OMANI and APSE I would have declared regulars. SEA and PEA also appear in compound expressions, but do not rival TEA.

The list of thrice-used entries appears below. It contains some of the usual suspects and some which are surprising, such as FAHRENHEIT and INDONESIA. The first phrases start to appear, such as GET UP AND GO.

ADHERENT ADVANTAGE ALLOA ALOFT ALUMNUS AMBLE ANDROMEDA ANNEAL ANON ANVIL
ASPIDISTRA ATTEST BETA CABIN CANON CRACK DEGREE DETRIMENT DOMICILE DOWNTOWN
DRESSER EGO EIDER ELITE EMMY EQUIP ERA ERRATUM EVADE EXTOL
FAHRENHEIT FELT GALILEO GASOHOL GET UP AND GO GOO GRANDSON HANDS DOWN ICON IDLE
IDYLL ILL IMPASTO INCA INDONESIA IRON KNIGHT LACONIC LADEN LEA
LEARN LEONARDO LIE LOTTO LULU MASTHEAD MASTODON MORIBUND NEIGHBOUR NIGHT
NOEL NONPLUS NOSE OAR OATH OP ART OPERA ORANGE ORIGINATE ORMER
PALATE PALMA PAR PATTERN PIKE PLEASANTRY POSER PRIME RALLYING REALISTIC
RESERVE RHINO RHONE RIOJA ROMEO RUBY SEASONED SHYSTER SINK SOHO
STAG STUD STUN TABLE TADPOLE TAHITI THETA THROW TITAN TO WIT
TORRENT TRACE TRINITY TRIPPER TRUMP TRUMPED-UP TUB TWEETER VEIN ZERO

My theory is that most of the frequent words are just good grid-fillers, so it should be no surprise that they appear often. No doubt they appear just as regularly in the Sunday Times and Jumbos, or the Guardian for that matter, so if you’re doing puzzles from multiple sources their frequency is multiplied and their hackneyed reputation earned. In fact, there’s an analysis of Guardian puzzles by one of the fifteensquared bloggers, which shows that the list of most-used words in those puzzles (for a 10-year period) is not that similar to the Times “thrice or more” list for 2008. This probably means that the Times list for 2009 will be quite different too.

Of course, a word being hackneyed doesn’t imply the same for the clue to it. Setters no doubt come up with alternative clues for words and store the better ones for later use.

I was surprised to find words such as GONER appearing on a Friday and again on the following Monday, so I began to wonder if the recurrence of words was a purely random process or if there was some form of self-regulation or editorial intervention. Without getting too technical, by considering the distribution of interarrival times I convinced myself (but probably not Sir J.F.C. Kingman) that the hand of the editor was evident. Peter took the more pragmatic approach of actually asking. [My understanding from Richard Browne, Times Crossword Editor, is that he works quite hard to reduce repetition of words in the puzzle and will, for example, tell the setters when he has noticed particular words coming up frequently.]
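
The post doesn't say which test was applied to the interarrival times, so the following Python sketch is only one plausible way to look at the question, not the method actually used. It measures the gap (in puzzles) between successive appearances of each repeated entry, then produces the same statistic after scattering each entry's appearances over randomly chosen puzzles. All function names and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict

def positions_by_entry(puzzles):
    """Map each entry to the indices of the puzzles it appears in.

    `puzzles` is a list of answer lists in publication order.
    """
    seen_at = defaultdict(list)
    for index, answers in enumerate(puzzles):
        for answer in dict.fromkeys(answers):  # de-duplicate, keep order
            seen_at[answer].append(index)
    return seen_at

def gaps_from_positions(positions):
    """Differences between consecutive sorted positions."""
    return [b - a for a, b in zip(positions, positions[1:])]

def interarrival_gaps(puzzles):
    """Observed gaps between successive appearances of repeated entries."""
    gaps = []
    for positions in positions_by_entry(puzzles).values():
        gaps.extend(gaps_from_positions(positions))
    return gaps

def simulated_gaps(puzzles, rng):
    """Gaps if each entry's appearances fell in uniformly random puzzles."""
    n = len(puzzles)
    gaps = []
    for positions in positions_by_entry(puzzles).values():
        k = len(positions)
        if k > 1:
            random_positions = sorted(rng.sample(range(n), k))
            gaps.extend(gaps_from_positions(random_positions))
    return gaps

def share_short(gaps, threshold):
    """Fraction of gaps no longer than `threshold` puzzles."""
    return sum(g <= threshold for g in gaps) / len(gaps) if gaps else 0.0

# Toy data, invented for illustration (not the 2008 puzzles).
toy_puzzles = [["GONER", "TEA"], ["ROD"], ["GONER", "APSE"], ["TEA"]]
observed = interarrival_gaps(toy_puzzles)
simulated = simulated_gaps(toy_puzzles, random.Random(0))
print(observed, share_short(observed, 2))
```

With real data, one would repeat the simulation many times and compare the observed share of short gaps with the simulated range. A consistently lower observed share would be in keeping with an editor spacing repeats out, though it would not prove it.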

I’m impressed by the lack of repetition that this analysis shows – it means that once you’ve seen a phrase like MIDDLE OF THE ROAD, you’ll be unlucky to see it more than once again in the same year.  We should probably remember, when grumbling about a plant like the recent MANZANITA, that we may have been saved from another MARGARITA.

There’s similar analysis available for the New York Times puzzle, covering a much longer time period, at http://www.xwordinfo.com/.  If you remember the rules about US-style crossword grids, it’s no surprise that they have much more repetition – over the period of about 16 years covered, there are currently 344 words which have been used 100 times or more – very roughly equivalent to the 5-timers above.

6 comments on “Kororareka’s stats for the Times Crossword 2008”

  1. Impressive analysis, indeed! However it doesn’t do much to address the issue of repetitiveness or cliche because there are so many ways to do it.

    One of my own favourites, for example, is ELI – apparently, the only priest in the Bible! I reckon to see him crop up monthly or thereabouts, in one crossword or another and in one form or another, usually as part of a word and so perhaps not caught in the above.

    One day I will set a crossword consisting entirely of cliches. Eli will be in 1ac, and Erato and muse will be there somewhere too…

    1. You’re quite right. I think we can all forgive answers recurring from time to time, particularly if they are clued inventively, but stale clueing is a little more difficult to forgive (and for a computer to detect). It does feel at times that if Eli had not existed it would be necessary to invent him.

      As for my own project, I intend to write a stage show with dialogue and music consisting entirely of Times crossword clues. The working title for this year’s production is Abednego of Abu Simbel: The Abstergent Acupuncturist featuring the hit songs A Shropshire Lad and Abide With Me. As you can see, it still needs a bit of work.

      1. I confirmed with a keyword search of the blog that we’ve recorded four uses of priest=>ELI since we started this version of Times for the Times on 1 Nov 2006 – a total of just over 700 daily Times puzzles. This isn’t infallible as some bloggers might count a clue with Eli in it as an easy one to leave out, but I think it supports my impression that the Times puzzle is quite good at keeping the clichés to a minimum.
        1. I can’t argue with the facts. And the overall high standard of clueing in the Times is why I do it in preference to some home grown ones, which can’t seem to achieve the same level of wit and erudition.
        2. Not really disputing that.. just saying that “stats” and “facts” may be limited tools in this context.

          For example, another one from my cliche list, the world’s only ancient city, UR, reared its ugly head again today (11 Feb). I forget, 3dn was it? UpriseR. Would that have been recorded as a stat or a fact, I wonder?

          1. We seem to have had a recent rash of Ur: it appeared in 24146, 24133 and 24123. Before that, the most recent ones I can find are 23816 and 23614. On one of the recent occasions, though, the next clue effectively used the same def for ALEPPO, and I’m sure I’ve seen ancient city = Thebes once or twice.
