Phil Karlton famously said there are only two hard things in Computer Science: cache invalidation and naming things. I gather many folks suppose "naming things" is about whether to use camel case or which specific symbols to pick, which is obviously trivial and mundane. But I always assumed Karlton meant the problem of making references work: the task of reliably relating intension (names and ideas) to extension (the things designated), which is also the topic of cache invalidation, since that is about knowing when to stop honouring an association once it is no longer valid.
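To make the analogy concrete, here is a minimal sketch (my own toy example, not anything Karlton wrote): a cache is literally a table of name-to-thing bindings, and the hard part is deciding when a binding no longer designates the right thing. A TTL is just one crude answer to that question.

    import time

    class NameCache:
        """Toy cache: binds names (intension) to values (extension), with a TTL
        standing in for the hard question of when a binding stops being valid."""

        def __init__(self, ttl_seconds):
            self.ttl = ttl_seconds
            self.entries = {}  # name -> (value, timestamp)

        def put(self, name, value):
            self.entries[name] = (value, time.monotonic())

        def get(self, name, recompute):
            entry = self.entries.get(name)
            if entry is not None:
                value, stamp = entry
                if time.monotonic() - stamp < self.ttl:
                    return value       # the binding is still trusted
            value = recompute()        # re-resolve the name to its current referent
            self.put(name, value)
            return value

    cache = NameCache(ttl_seconds=30)
    config = cache.get("config", recompute=lambda: {"retries": 3})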
https://web.archive.org/web/20130805122711/http://lambda-the...
It was actually about coming up with short but sufficiently expressive names that greatly improve code legibility: names that represent the intent well enough and avoid confusion.
But I am not sure how it relates to "stopping the association once invalid"? Is this about renaming things when the names are no longer suitable? That's only a very special case of "naming is hard", but I do believe naming is hard even when starting from scratch (having seen many a junior engineer come up with what they thought were descriptive names, but which mostly recorded how their understanding developed over time rather than the synthesised understanding they ended up with).
1) cache invalidation
2) naming things
0) off-by-one
Mostly, since many devs name by what a thing does, rather than by how it might be found.
For example, naming a function calculateHaversine won't help someone looking for a function that calculates the distance between two lat/longs unless they already know that the haversine formula does that.
Or they default to the shortest name: Atan, asin, Pow, for example.
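To illustrate (a toy sketch of my own; none of these names come from a real codebase): keep the implementation-named function if you like, but a wrapper named for the caller's intent is what makes it findable.

    from math import asin, cos, radians, sin, sqrt

    EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

    def calculateHaversine(lat1, lon1, lat2, lon2):
        # Named after the technique: only discoverable if you already know the term.
        phi1, phi2 = radians(lat1), radians(lat2)
        dphi = radians(lat2 - lat1)
        dlam = radians(lon2 - lon1)
        a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
        return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

    def distance_between_coordinates_km(lat1, lon1, lat2, lon2):
        # Named after the intent: searchable without knowing the word "haversine".
        return calculateHaversine(lat1, lon1, lat2, lon2)

    print(distance_between_coordinates_km(51.5074, -0.1278, 48.8566, 2.3522))  # London to Paris, roughly 340 km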
If you want to synthesize this type of knowledge on the fly because you don't like learning other people's conventions, just feed the docs to chatgpt and ask if there's a function that solves your problem.
This is why a formal education is so important, and why books like the "Gang of Four" one are something of a standard. They've given names to some common patterns, allowing a more efficient form of communication and a higher level of thinking. Are the patterns actually good? Are the names actually good? That is beside the point.
f[<Intension|Extension>] == 0
Just in case: https://en.wikipedia.org/wiki/Bra%E2%80%93ket_notation#Hermi...
More seriously.. https://en.wikipedia.org/wiki/Binding_(linguistics)
And its derivatives in CS
Etc
TL;DR: "naming things" itself was the joke all along.
Not being able to make a good guess is a lack of understanding of the problem domain, nicely rolled up into this catch-all term.
But on the other side, there is administration. I work on projects with names like FRPPX21, in category PX23, same idea for tax forms, and just about everything administrative. Should I write code with variable names like that, I would be yelled at by the unfortunate guy who gets to read it.
Also: “Jevons paradox”. That one is nasty! For example: just about anything we do to decrease the use of fossil fuels by some small percentage makes the corresponding process more efficient, more profitable, and thus more widely used. That’s a nasty, nasty problem. I guess it’s not specific to computer science, but to all engineering.
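A toy calculation (every number here is invented purely for illustration, not real-world data): a 20% efficiency gain still increases total fuel use if the cheaper process gets used 40% more.

    # Toy Jevons-paradox arithmetic; all numbers are made up.
    fuel_per_unit_before = 1.00   # fuel burned per unit of output
    units_before = 100            # demand at the old cost

    fuel_per_unit_after = 0.80    # a 20% efficiency improvement
    units_after = 140             # demand grows 40% because output got cheaper

    print(fuel_per_unit_before * units_before)  # 100.0 total fuel before
    print(fuel_per_unit_after * units_after)    # 112.0 total fuel after: more, not less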
>Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
1.) Caching things
3.) Race conditions
2.) Naming things
4.) Off-by-one errors
1) According to Tom Bajzek on Phil Karlton's son's blog, the saying goes back to Phil Karlton's time at CMU in the 1970s: https://www.karlton.org/2017/12/naming-things-hard/#comment-...
2) How did Phil recognize this difficulty around cache invalidation before he even entered the workforce (going to Xerox PARC, DEC, SGI, Netscape)?
Answer: as a grad student, he was contributing to discussions around the design of the Hydra filesystem at CMU at that time. The following 1978 paper credits discussions with him by name, which is probably a good hint as to where he learned about the difficulties of cache invalidation: https://dl.acm.org/doi/pdf/10.5555/800099.803221 He started out perhaps more interested in the math side of things: https://dl.acm.org/doi/pdf/10.1145/359970.359989
3) Also mildly coincidental to me is that one of SGI's core technical accomplishments in its waning years (about the time Phil left them for Netscape, so he likely was not personally involved; I don't know) was dealing with memory caching in highly scalable single-system-image SMP (symmetric multiprocessing) servers, when you go from 16+ CPU SMPs to a memory subsystem needing to support 512-1024 CPUs...
Answer: you have to
A) make the memory non-uniform (non-"symmetric") in its latency to different CPUs (NUMA: https://en.wikipedia.org/wiki/Non-uniform_memory_access), and
B) invent new ways of handling the resulting cache coherency problems to mask the fact that some CPUs have closer access to memory than other CPUs to keep the programming model more like SMP and less like pure separate-memory clustering.
Here's a paper outlining how that was done in 1998: https://courses.cs.washington.edu/courses/cse549/07wi/files/... which in turn was based on Stanford's FLASH multiprocessor work: https://dl.acm.org/doi/pdf/10.1145/191995.192056
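For intuition, here is a toy model (my own sketch, wildly simplified compared to what FLASH or the Origin actually did) of the directory-based idea behind that coherency work: a per-line directory remembers which nodes hold a copy, so a write can invalidate exactly those copies instead of broadcasting to all 512-1024 CPUs.

    # Toy directory-based cache-coherency model; the states and names are mine.
    class LineDirectory:
        """Directory entry for one cache line: tracks sharers and an exclusive owner."""

        def __init__(self):
            self.sharers = set()   # nodes holding a read-only copy
            self.owner = None      # node holding an exclusive, writable copy

        def read(self, node):
            if self.owner is not None and self.owner != node:
                print(f"downgrade node {self.owner} from exclusive to shared")
                self.sharers.add(self.owner)
                self.owner = None
            self.sharers.add(node)
            print(f"node {node} reads; sharers = {sorted(self.sharers)}")

        def write(self, node):
            # Invalidate every other copy before granting exclusive ownership.
            for other in sorted(self.sharers - {node}):
                print(f"invalidate shared copy at node {other}")
            if self.owner is not None and self.owner != node:
                print(f"invalidate exclusive copy at node {self.owner}")
            self.sharers = set()
            self.owner = node
            print(f"node {node} now owns the line exclusively")

    line = LineDirectory()
    line.read(0)
    line.read(2)
    line.write(3)   # invalidations go only to nodes 0 and 2
    line.read(1)    # forces node 3's exclusive copy to be downgraded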
This cache-coherent NUMA (ccNUMA) technique went on to be used in AMD Opteron, Itanium, and Xeon SMP systems, and remains in use to this very day.