I absolutely love reading about hard problems that are invisible to most people.
The usual technique was to start by holding up a color card on the stage/floor, then use a vectorscope[1] to get all the dots to line up in the right place, then use a waveform monitor for exposure. During the event, there would be fine-tuning by eye, or corrections as things drifted out of line.
[1] https://en.wikipedia.org/wiki/Vectorscope#/media/File:PAL_Ve...
PS: You can also see modern vectorscope / waveform monitor images in this photo from the cyanview blog. Look for the black and white X-ray looking things on the screens. https://www.cyanview.com/wp-content/uploads/2022/10/20221006...
If you want the green of the grass on all the pylon cameras to match your main production cameras, adjustments are a must. And with outdoor stadiums, this is a constant task—lighting conditions change throughout the day, even when a cloud moves across the sky. When night falls, video engineers are working non-stop to keep everything perfectly aligned with the main cameras.
Funny, the only kind of picture that I don't color grade is sports photos, because I don't want to mess up the color of the jerseys, though if I were careful in how I did it, it would be OK.
I have been struggling to develop a reliable process for making red-cyan anaglyphs and one step of the process would be a color grade that moves colors away from reds and cyans that would all be in one eye or the other eye. I've got to figure out how to make my own LUT cubes to do it.
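On the "make my own LUT cubes" step, here is a minimal sketch in Python of generating a 3D LUT in the common `.cube` text layout (a `LUT_3D_SIZE N` header followed by N^3 RGB triplets, with red varying fastest). The `desaturate_toward_gray` transform and its `amount` parameter are hypothetical placeholders, not a tested anaglyph grade; the idea would be to swap in whatever mapping moves colors away from pure reds and cyans.

```python
from typing import Callable, Tuple

RGB = Tuple[float, float, float]


def desaturate_toward_gray(rgb: RGB, amount: float = 0.5) -> RGB:
    """Hypothetical placeholder grade: blend each channel toward the pixel's mean."""
    r, g, b = rgb
    mean = (r + g + b) / 3.0
    return (r + (mean - r) * amount,
            g + (mean - g) * amount,
            b + (mean - b) * amount)


def make_cube(size: int, transform: Callable[[RGB], RGB]) -> str:
    """Return the text of a .cube 3D LUT with size**3 entries."""
    lines = ['TITLE "sketch"', f"LUT_3D_SIZE {size}"]
    step = size - 1
    for b in range(size):              # blue varies slowest
        for g in range(size):
            for r in range(size):      # red varies fastest
                out = transform((r / step, g / step, b / step))
                # Clamp to the [0, 1] range before writing the triplet.
                lines.append(" ".join(f"{min(max(c, 0.0), 1.0):.6f}" for c in out))
    return "\n".join(lines) + "\n"


cube_text = make_cube(17, desaturate_toward_gray)
```

Writing `cube_text` to a file ending in `.cube` should give something a grading tool can load; 17 points per axis is an arbitrary choice here, and larger sizes trade file size for precision.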
Perhaps they could read the article prior to asking because some of those questions might be answered in the article?
This example helped, I think. Thanks!
I do not believe this. If you look around, there are many non-issues being sold as real problems[1], and people buy it. People buy all sorts of crap, that is just consumerism in effect. If you did sales, you probably know this. Same thing with "bullshit jobs". Perhaps "sustainable" is the keyword here, but I am not so sure about that either.
[1] Snake oil comes to mind. Pretty flourishing business.
I worked at a startup that had decent tech, but a shit product. It wasn't focused enough to really solve clients' issues. Maybe it alleviated some issues, but it also introduced more. It was disliked by the people who actually had to use it. But our sales guy was really good at convincing those people's bosses that it would make the company more money.
It was a total top-down sales approach. Throw a bunch of buzzwords at the founder/CFO/boss, they force it on the people actually doing the work. I hated it, and it worked so well that fixing the product was never a priority. It was always new "features" to slap on more buzzwords to the sales pitch. I really think it could've been a good product, too!
As we were able to very quickly respond to customer demands for anything special that they would need, they ended up being our main sales channel by recommending the solution further. And nearly 10 years later, we're still pretty much on the same model, trying to keep up with the developments, delivering products and supporting our customers. The website is outdated and we've been trying to make progress there for years; eventually we'll succeed at that.
Operating such high visibility events like the Olympics sounds pretty nerve-wracking. How much of an issue is security for you? Do you experience any attacks?
With the rise of remote production (where the control room is located at headquarters while cameras and microphones are on-site at stadiums), broadcasters are implementing VPNs, private fiber connections, and other methods to stay largely separate from the public internet.
In our case, the only part that uses the public internet is the relay server, which is necessary when working over cellular networks. Security is one of the main reasons we haven’t expanded this service into a full cloud portal yet—it’s much easier to secure a lightweight data relay with no database, running on a single port, than to lock down a larger, more complex system.
So even if someone were able to break into the server through the small attack surface, they would not be able to change any setting on any of our customers' devices, or even read any status. Of course, if someone can break into our server, a DoS is inevitable, but so far this has never happened.
Sounds like the entertainment industry. Everyone truly knows everyone, especially when you're working on the same show with the same crew year after year.
It's definitely a family of sorts.
Regardless, just keep making quality software that sells itself!
The interesting part is that the main marketing and sales is by word-of-mouth and quality of product. All the hardware is not even on the website, which was very confusing to my understanding when writing. It makes sense under the resource constraints.
https://elixir-lang.org/blog/categories.html#Elixir%20in%20P...
It's pretty much just appeal to authority. "These people are successful and they used Elixir, why don't you?"
I can spill all the juicy details as the main author and instigator.
Cyanview reached out to me to help find a dev a while back. Hearing about their customers, I knew it would be a decently big splash for Elixir. I was surprised that they were unknown and had this success with big household-name clients.
I like them. I like their whole deal. Small team, punching above their weight. Hardware, software, FPGAs and live broadcasts. The story has so much to it. David and team have been great sports in sharing their story.
Fundamentally I care more about Elixir adoption though, I reached out to the Elixir team and offered to interview them and write something up.
A case study about successful Elixir production deployments is definitely content marketing. But for Elixir. It is a very common question when mentioning a less common language. "Who uses this?" I thought it was a very interesting case. Glad to have it documented. The style of a case study won't suit everyone.
I suppose "without any marketing, before _this_" would have been funny.
We use MQTT a lot, it is really a central piece of our architecture, but Elixir brings a lot of benefits regarding the handling of many processes which are often loosely coupled. The BEAM and OTP offer a sane approach to concurrency and Elixir is a nice language on top. Here are what I find to be the most important benefits:
- good process isolation, even the heap is per process. This allows us to have robust and mature code running along more experimental features without the fear of everything going down. And you still have easy communication between processes
- supervision tree allows easy process management. I also created a special supervisor with different restart strategies. The language allows this and then, it integrates as any other supervisor. With network connections being broken and later reconnected, the resilience of our system is tested regularly, like a physical chaos monkey
- the immutability as implemented by the BEAM greatly simplifies how to write concurrent code. Inside a process, you don't need to worry about the data changing under you, no other process can change your state. So no more mutex/critical sections (or very little need). You can still have deadlock though, so it is not a silver bullet
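For readers who haven't used OTP, the restart idea in the supervision point can be loosely illustrated outside the BEAM. This is only an analogy in Python, not how OTP supervisors are implemented: `supervise`, `flaky_connection`, and the `max_restarts` limit are hypothetical names made up for the sketch, and real supervisors manage separate isolated processes rather than calling a function in a loop.

```python
from typing import Callable, List


def supervise(worker: Callable[[int], str], max_restarts: int) -> List[str]:
    """Run worker; if it crashes, start it again, up to max_restarts restarts."""
    log: List[str] = []
    for attempt in range(max_restarts + 1):
        try:
            result = worker(attempt)
            log.append(f"attempt {attempt}: ok ({result})")
            return log
        except Exception as exc:  # the crash stays contained in this one worker
            log.append(f"attempt {attempt}: crashed ({exc})")
    log.append("giving up: restart limit reached")
    return log


def flaky_connection(attempt: int) -> str:
    """Hypothetical worker: the network link is down for the first two attempts."""
    if attempt < 2:
        raise ConnectionError("link down")
    return "connected"


log = supervise(flaky_connection, max_restarts=3)
```

The point of the analogy is that the worker itself contains no retry logic; recovery lives in the supervising layer, which is roughly the separation described above.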
I work at the university and we build acquisition systems with exotic cameras and screens, do you think we could meet one time to discuss possible (commercial and research) projects ?
One of the next steps would be to have a real cloud portal where we could remotely access cameras, manage and shade them from the portal itself. In this context we have been advised to look at NATS. Remote production or REMI is now getting more traction and some of our clients handle 60+ games at the same time from a central location. That definitely creates new challenges as centralizing everything for control is a need but keeping distributed processes and hardware is key to keep the whole system up if one part fails.
Feel free to contact me, details in profile.
At first, we used an erlang lib emqtt, but it was left unmaintained and then removed from github. We had to switch to something else. Not completely happy, but it works for us
I will have a look at this new package, it looks promising
As much as I am a critic of the system, if this is your use case, this is out-of-the-box a very strong foundation for what you need to get done.
For anyone interested in the video stream itself, here's a summary. On-site, everything is still SDI (HD-SDI, 3G-SDI, or 12G-SDI), which is a serial stream ranging from 1.5Gbps (HD) to 12Gbps (UHD) over coax or fiber, with no delay. Wireless transmission is typically managed via COFDM with ultra-low latency H.264/H.265 encoders/decoders, achieving less than 20ms glass-to-glass latency and converting from/to SDI at both ends, making it seamless.
SMPTE 2110 is gaining traction as a new standard for transmitting SDI data over IP, uncompressed, with timing comparable to SDI, except that video and audio are transmitted as separate independent streams. To work with HD, you need at least 10G network ports, and for UHD, 25G is required. Currently, only a few companies can handle this using off-the-shelf IT servers.
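As a rough sanity check on the SDI and network figures above, the nominal SDI link rates follow from the total raster (active picture plus blanking), the frame rate, and 20 bits per sample period (10-bit 4:2:2). The sketch below assumes the standard 2200x1125 total raster for 1080-line formats and a 4400x2250 raster for 2160p60; the active-picture figure at the end ignores all packet overhead, so treat it as a lower bound rather than an actual SMPTE 2110 stream rate.

```python
def video_rate_gbps(width: int, lines: int, fps: int, bits_per_sample_period: int = 20) -> float:
    """Raw bit rate in Gbps for a raster of width x lines at fps frames per second."""
    return width * lines * fps * bits_per_sample_period / 1e9


# Total raster (including blanking) gives the nominal SDI link rates.
sdi_rates = {
    "HD-SDI, 1080 lines at 30 frames/s": video_rate_gbps(2200, 1125, 30),
    "3G-SDI, 1080p60": video_rate_gbps(2200, 1125, 60),
    "12G-SDI, 2160p60": video_rate_gbps(4400, 2250, 60),
}

# Active picture only, before any IP packet overhead is added.
uhd_active_gbps = video_rate_gbps(3840, 2160, 60)

for name, rate in sdi_rates.items():
    print(f"{name}: {rate:.3f} Gbps")
print(f"2160p60 active picture only: {uhd_active_gbps:.3f} Gbps")
```

The HD result is already above what a 1G port can carry, and the UHD active picture alone is close to 10 Gbps before overhead, which is consistent with the 10G and 25G port requirements mentioned above.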
Anything streamed over the public internet is compressed below 10 Mbps and comes with multiple seconds of latency. Most cameras output SDI, though some now offer direct streaming. However, SDI is still widely used at the end of the chain for integration with video mixers, replay servers, and other production equipment.
AIUI, technically, the old phone switches worked the same way. BEAM handled all the metadata and directed the hardware that handled the phone call data itself, rather than the phone call data directly passing through BEAM. In 2025 it would be perfectly reasonable to handle the amount of data those switches dealt with in 2000 through BEAM, but even in 2025, and even with voice data, if you want to maximize your performance for modern times you'd still want actual voice data to be handled similarly to how you handle your video streams, for latency reliability reasons. By great effort and the work of tons of smart people, the latency sensitivity of speech data is somewhat less than it used to be, but one still does not want to "spend" your latency budget carelessly, and BEAM itself is only best-effort soft realtime.
All programming languages can do any task. It's about how easy they make that task for you.
For instance, Elixir supports compilation targeting GPUs (within exactly the same language, not a fork).
Most languages do not allow that (and for most it would be fairly hard to implement).
From the article:
> “Yes. We’ve seen what the Erlang VM can do, and it has been very well-suited to our needs. You don’t appreciate all the things Elixir offers out of the box until you have to try to implement them yourself.
In every case, like the engineering team in this article demonstrates, the developer experience and end results have exceeded expectations. If you haven’t used Elixir, you should give it a try.
Edit: Fixed an editing error.
That's what I love about Elixir, but it means that selling it is more like convincing a developer who knows and uses CSV to switch to Postgres. There's a ton of advantages to storing data in a relational DB instead of flat files, but now you have to define a schema up front, deal with table and row locking, figure out that VACUUM thing, etc.
When you're just setting out to learn a new language, trying to understand a new OS on top hurts adoption.
Unless someone who knows those three languages is curious or encounters a particular problem that motivates them to explore, they're unlikely to pick up an immutable, functional language.
I have never looked back.
Elixir is an absolute joy to use. It simplifies multi-threaded programming, pattern-matching makes code easier to understand and maintain, and it is magnitudes faster to code in than Java. For me, Elixir’s version of functional programming provides the ease of development that OOP promised and failed to deliver.
In my opinion, Elixir is software engineering’s best kept secret.
As an example, we just rolled out a feature in our cloud offering that allows a user to remotely call a robot to a specified waypoint inside a facility, and show real-time updates of the robot's position on its map of the world as it navigates there. We did this with just MQTT, LiveView, Phoenix PubSub, and a very small amount of JS for map controls. The cloud portion of this feature was basically built in 2-3 weeks by one person (minus some pre-existing code for handling the display of raw map PNGs from S3, existing MQTT ingress handling, etc.).
Of course you _can_ do things like this with other languages. However, the core language features are just so good that, for our use cases, it blows the other choices out of the water.
I don’t think there’s any evidence whatsoever that you would catch runtime bugs sooner with Gleam than with Elixir (or Erlang). Erlang’s record for reliability is stronger than many statically typed languages, including even Java.
There is a certain class of errors static types can prevent but there’s a much larger set of those it can’t. To make the case for a language like TS/Java/Swift/Golang or Gleam actually resulting in fewer runtime defects than Erlang or Elixir, I’d want to see some real world data.
Maybe you can go into this more, but I don't really understand what that means, what is this larger set of runtime errors that can't be prevented by static typing?
I use a bit of Elixir, and I'd say most of the errors I'm facing at runtime are things like "(FunctionClauseError) no function clause matching", which is not only avoidable in Gleam, but actually impossible to write without dipping into FFI.
I'm excited for more static typing to come into Elixir, as it stands I'm only really confident about my Elixir code when it has good test coverage, and even then I feel uneasy when refactoring. Still a fun language to use though.
- Logic errors
- Null or Undefined values (prevented in many newer languages)
- Out-of-bounds errors
- Concurrency-related issues
- Arithmetic errors (undefined operations, integer overflow, etc)
- Resource management errors
- I/O errors
- External system failures
- Unhandled exceptions (e.g., RuntimeException in Java)
If you use a language like Rust, you can get help from the type system on several of these points, but ultimately there's a limit to what type systems can do before becoming too complex.
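To make a couple of the items in that list concrete, here is a small sketch in type-annotated Python (chosen only for brevity; the function names are made up for illustration). The annotations are consistent, so there is nothing in the signatures for a static checker to object to, yet both functions fail at runtime on an empty list.

```python
from typing import Callable, List, Tuple


def average(values: List[float]) -> float:
    """Well-typed, but divides by zero when the list is empty."""
    return sum(values) / len(values)


def first_sample(values: List[float]) -> float:
    """Well-typed, but indexes out of bounds when the list is empty."""
    return values[0]


def run(fn: Callable[[List[float]], float], arg: List[float]) -> Tuple[str, object]:
    """Call fn and report either the result or the name of the runtime error."""
    try:
        return ("ok", fn(arg))
    except Exception as exc:
        return ("error", type(exc).__name__)


results = [
    run(average, [1.0, 3.0]),
    run(average, []),
    run(first_sample, []),
]
```

Richer type systems can rule out some of these (for example with non-empty list types), which is the trade-off against complexity mentioned above.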
I suppose we can have a mixed language project, with erlang, elixir and gleam. Not sure about the practicality of it though
As for the topics on MQTT, they function as a kind of universal API—at least internally. Some partners and customers are already using them to automate certain functions. However, we haven’t officially released anything yet, as we wouldn’t be able to guarantee stability or prevent changes at this stage.
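For anyone unfamiliar with why topics can act like an API: MQTT subscribers select messages with topic filters, where `+` matches exactly one level and `#` matches all remaining levels. Below is a minimal sketch of that matching rule in Python. The topic names in the usage are entirely hypothetical and are not Cyanview's actual topics, and the sketch ignores the special handling of topics starting with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical topic layout, for illustration only.
all_iris = topic_matches("site/+/camera/+/iris", "site/stadium-a/camera/12/iris")
everything_on_site = topic_matches("site/stadium-a/#", "site/stadium-a/camera/12/tally")
too_deep = topic_matches("site/+/camera", "site/stadium-a/camera/12")
```

A partner automating one function would subscribe to a narrow filter, which is also why renaming topics later would break them and why stability matters before an official release.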
I would say that unless you have a professional reason, there's very little benefit to the average end-user to do a deep dive into it. If your intention is to spend $7000 on a RED camera and then $13,000 on lenses, gimbal, cage, follow focus, matte box, memory cards etc to make a small and cost effective single camera production package, then by all means, dig into it.
Grading is the creative process of adding a look to your production, which is usually handled in post production but there are now ways to do it live, although by using similar tools as the post production software. And they still re-do it in post production. This is used live for concerts and fashion shows.
There is a significant distinction between shading and grading.
Shading is essential in the TV industry, where the goal is to ensure all cameras are perfectly matched in exposure, tone curve, and colors. This ensures seamless transitions between camera angles, maintaining consistency in skin tones, fine details, and the color of grass and sky. A crucial aspect of shading is accurately reproducing sponsor logos' colors, which can sometimes be the starting point as that's where the money comes from. Creativity plays a lesser role here, as the focus is on following industry standards such as ITU-R BT.709 for SDR or ITU-R BT.2020 and HLG for HDR.
Grading, on the other hand, is a creative process meant to give a distinctive look to a production. Traditionally done in post-production, it can also now be applied in real time using tools similar to those found in post-production software. Despite this, it is often still refined further in post. Live grading is commonly used for events such as concerts and fashion shows, where you want to look different from TV productions.
PS You might have pasted two different answer drafts above. Paras 1,4 and 2,5 deliver similar information
> The devices in a given location communicate and coordinate on the network over a custom MQTT protocol. Over a hundred cameras without issue on a single Remote Control Panel (RCP), implemented on top of Elixir’s network stack.
Makes sense! MQTT is, if I understand right, built on TCP. Idk if I would have found the same solution, but it's seemingly a good one.
Now the same products are used for very small productions that don't have the budget for any studio camera (look typically at 50k+ for a camera without lens). In that case we try to provide a similar user experience and functions but with much more affordable cameras.
Finally, more and more live productions are now handled using cine-style cameras which don't have the standard broadcast remote panels, and that's another area we cover, by combining camera control with control of many external boxes like external motors to drive manual lenses or 3D LUT video processors. Applications include fashion shows, concerts, theater, churches, studio shows, even corporate.
In the end Elixir is used for a lot of small processes which handle very low level control protocols. And then on top of that add a high-level of communication between devices either on local networks or over the cloud.
Just out of curiosity, what would be examples of very small productions here? Would an independent YouTube channel with great production quality be using this?
One important point: if you are not live, then there's usually the possibility to adjust everything manually on the camera and then finish in post production, so our remotes are nearly never used outside the constraints of live productions.
In the opposite direction, I heard that they had around 250 cameras on Love Island, but you can pretty much control everything from one or two remotes as there isn't a need for a lot of changes at a single time. The action only happens in front of a few of them. That said, we still have 250 processes running and controlling these cameras continuously.
I suppose the FX30, FX3 and FX6 are in Sony's cinema line and may have all the color stuff that these systems want to tweak, but I'm not sure. These cameras do get a fair bit of play on YouTube.
Let me get it out: I love BEAM. OTP is awesome and one of the best systems in its kind. I was completely enamored with Elixir years ago as a modern Erlang which excited me to the bone.
It’s no longer the case. When you get into non-trivial things there are many sharp edges and paper cuts. Some from the top of my head:
- it’s impossible to disable warnings - test runs are often highly verbose because libraries ignore them and (as discussed somewhere in forums) warnings are deemed useful so they can’t be disabled
- the only way to catch some, important ones, in CI is to use "warnings-as-errors" …
- so one cannot use deprecation flags because it’s also a warning
- when having non trivial ecosystem one cannot selectively deprecate and get errors, this has to be a human process (remember to replace…)
- when doing umbrella tests on non compiled code seed influences order of compilation
- this order of compilation influences outcome and can lead to buggy code
- dependency compilation is not parallelized - takes a lot of time and uses 10% of CPU
- compilation process can break and Elixir isn’t aware that it was interrupted - this means that it doesn’t try to compile properly app/dependency but instead tries to release crippled one
- documentation is hard to use - searching shows random versions and there isn’t even a link to „open newest”
- searching in docs often fails to find keywords you can actually see on the page
- a lot of knowledge is implicit (try checking if you can dynamically add children to a Supervisor)
- sidebar with new ExDocs breaks for some reason so there is no navigation
- there is no way to browse navigation outside this broken ExDocs which outputs only HTML and LSP
- LSP is an afterthought; there are a few, but none works well
- Dialyzer/dialyxir typespecs are useless most of the time
- Squiggly arrow works weird (i.e. ~0.3 might catch 0.99) - my colleague recently mentioned Renovate not picking it up
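On the squiggly arrow point: as I understand the documented behaviour of Elixir's `~>` operator, the result depends on how many version segments you write. `~> 0.3` means `>= 0.3.0 and < 1.0.0` (so 0.99.0 does match), while `~> 0.3.0` means `>= 0.3.0 and < 0.4.0`. Here is a small Python sketch of that rule; `pessimistic_bounds` and `matches` are hypothetical helper names, and pre-release and build suffixes are deliberately ignored.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Parse a plain dotted version such as '0.3' or '0.99.0' into integers."""
    return tuple(int(part) for part in version.split("."))


def pessimistic_bounds(requirement: str) -> Tuple[Version, Version]:
    """Return (inclusive lower, exclusive upper) bounds for '~> X.Y' or '~> X.Y.Z'."""
    parts = parse(requirement.replace("~>", "").strip())
    if len(parts) not in (2, 3):
        raise ValueError("expected two or three version segments")
    lower = parts + (0,) * (3 - len(parts))
    # Drop the last written segment and bump the one before it.
    prefix = parts[:-1]
    upper = prefix[:-1] + (prefix[-1] + 1,)
    upper = upper + (0,) * (3 - len(upper))
    return lower, upper


def matches(requirement: str, version: str) -> bool:
    lower, upper = pessimistic_bounds(requirement)
    return lower <= parse(version) < upper


two_segments = matches("~> 0.3", "0.99.0")     # upper bound is 1.0.0
three_segments = matches("~> 0.3.0", "0.99.0")  # upper bound is 0.4.0
```

So writing the patch segment explicitly is the way to keep the range within a single minor series.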
I could go on and on. I’m doing plenty of research so I’m working with various languages including „niche” ones like Prolog, OCaml, Clojure, Cuelang. Recently I’ve been developing tooling in Go, and many core systems are developed in Rust and Elixir, and I work on the latter often.
In principle Elixir is awesome, but it has the worst developer experience of all. Sometimes it takes 4h to prepare and push a release. The tooling I’m working on can do the same in 5 minutes - I’m parallelizing processes in containers, making idempotent output artifacts and using heuristic failure detection to retry on flakiness. When switching between Go and OCaml you can sense how the tooling cares for me and my time. Often I bounce off forums where people’s needs are shrugged off as non-essential, treating those who came as uneducated juniors (because who in their right mind would want parallel dependency compilation or a way to disable compile-time warnings).
There is nothing better than BEAM, but (for me) Elixir got much worse over the years.
But. Honestly, I hate the hell out of how Elixir's docs look, and am pretty unhappy with how Erlang's docs started aping the style.
Seriously, check out the EBNF-esque description of the types for this function: <https://web.archive.org/web/20170509122932/http://erlang.org...>. Notice also how some of the documentation for all of the 'request' variants fits on a single screen. Scroll to the top of the page and notice the regular formatting and prominent, clear section headings. See how the significant data types used by the module are described in one place. Scroll to the bottom and observe the "See also" section. Notice the very clear navigation on the left-hand side that gives you obvious springboards to any part of the documentation.
Compare that to this: <https://www.erlang.org/doc/apps/inets/httpc.html#request/1>. Notice how you get a function or three of typespec on a screen. There's so much scrolling to get to the function's behavioral description. And the EBNF has been replaced with raw typespecs! If you understand how to read Erlang typespecs, it's totally possible to read the function type. But, like, if you're starting out with the language, this is WAY harder to read. Not to mention the loss of the clear headings at the top of the document and the centralized list of data types, as well as the "phone style" navigation widget on the left that obscures at least as much as it reveals.
Now I’m pretty handy with types, especially TypeScript types; I've managed to do some pretty complex stuff like using TS types to statically verify complex OpenAPI (Swagger) APIs on both client and server - basically re-implementing it all for compile-time checking.
When I started using Elixir/Dialyzer types I would get into situations where I was like “this stupid error here! It doesn’t understand exactly what I’m trying to do and complains for no reason”. After delving deeper, though, I found that in 90% of cases it was actually a bug: I had misunderstood or forgotten something.
After that I started respecting Dialyzer more. Hopefully with the new built-in types it will be even better and more user friendly.
I'm curious what you're working on with Elixir because my experience overall has not been the same.
For example our releases take 10 minutes to cut.
Or when I ask questions on Discord I tend to get answers pretty quickly.
It’s not a specific build but the whole thing, e.g. build takes 10 minutes, but test suite takes 10 minutes as well. Test suite can fail because of a bug or (more often) because of some race condition or build issue.
As I mentioned - today I’m working with Go. It’s nowhere near BEAM but it’s not critical, I never spent more than 15 minutes debugging Go race condition.
And yes, the code is at fault, and I’d expect the ecosystem to help fix it, but we have no such help. E.g. circular dependencies in an umbrella. You can have them. You can print them. There is no warning. They result in inconsistent builds and 40s LSP check loops, during which I have zero access to documentation.
But if I use arrow for map I will get a warning and a compilation error.
Can you reproduce this in any way? Because I cannot:
mix new parent --umbrella
cd parent
mix new apps/foo
mix new apps/bar
Now change `Foo.hello` to call `Bar.hello` and vice-versa. When you run `mix compile`, you will get warnings like this: warning: Bar.hello/0 is undefined (module Bar is not available or is yet to be defined). Make sure the module name is correct and has been specified in full (or that an alias has been defined)
But of course, the `foo` and `bar` applications do not depend on each other, you can add explicit dependencies, such as `foo` depending on `bar` or `bar` depending on `foo`, but you always get warnings. And if you literally make it a dependency cycle, the app doesn't even boot: ** (Mix) Could not sort dependencies. The following dependencies form a cycle: foo, bar
Apps have to be compiled in order and one will by definition be compiled before the other, so it is really unclear how you could have those circular dependencies. But even then, let's say that somehow you have an undeclared and undefined cycle between `foo` and `bar`. The point of umbrella projects is that each app can be compiled in isolation, so you should be able to go to `bar` and compile it in isolation without `foo`, and if it is trying to invoke `foo` somehow, it will be made visible.
So yes, I would need a way to reproduce this, because there are warnings and tooling in place to deal with those. Thanks!
- There seems to be some confusion in relation to warnings. There are two types of warnings, compile-time warnings and runtime warnings. Compile-time warnings are emitted during compilation time and therefore should not affect test runs, aka, when you run code. Runtime warnings can be captured during tests, using `ExUnit.CaptureIO`. Deprecating modules and functions are compile-time warnings
- Indeed you need to enable `--warnings-as-errors` to halt compilation due to warnings in CI. Our philosophy here is to emit compilation warnings instead of compilation errors whenever possible, so you can run, debug, and test your code, instead of forcing your code to be pristine while you are still working on it. The focus here is precisely to provide a better developer experience. Then if you do want them to fail upfront, as in CI, you pass the flag
- "when doing umbrella tests on non compiled code seed influences order of compilation" - I am not sure what this means, sorry. Can you expand? But generally speaking our test framework randomizes test order by default, because you should not depend on order between tests or have dependencies between test files
- "dependency compilation is not parallelized" - when a dependency is compiled, the files in a dependency are parallelized, but not the dependencies themselves, so I'd be very surprised if it only used 10% CPU before. In any case, a PR adding this feature was merged this week: https://github.com/elixir-lang/elixir/pull/14340. On my machine, compiling a project like Livebook uses 350% CPU without the flag above (showing some parallelism), and with the flag above set to 4, it is about 800% (250% + 250% + 150% + 150%). Note my machine has 8 performance cores and I don't get additional gains beyond 4 partitions
- "compilation process can break and Elixir isn’t aware that it was interrupted" - our tooling has code to deal precisely with this: https://github.com/elixir-lang/elixir/blob/c5c87a661efac6809.... If it still happens, it is a bug and must be reported, so we can fix it
- "try checking if you can dynamically add children to a Supervisor" - the top 4 results for "dynamically add children to a Supervisor in elixir" in Google and Ecosia lead to correct answers in the ElixirForum, StackOverflow, and the documentation. For completeness, I have also asked Claude, which gave a perfect answer (IMO): https://claude.ai/share/9d1e2ad4-2e43-4c32-a293-6fff32dd7001
- "there isn’t even a link to „open newest”" - this has been added to ExDoc, here is one example: https://hexdocs.pm/req/0.5.9/readme.html - but note we have always listed the versions in the sidebar
- "sidebar with new ExDocs break for some reason so there is no navigation" - please give an example, as this would be a bug and should be fixed, and I am not aware of any reports at the moment. Also note docs are available in the terminal, both inside `iex` or by doing `mix help SomeModule`
It's not a problem for people who are completely immersed and can remember most of the guidelines/policies/idiosyncrasies. Getting new people on board or even following guidelines is hard when the work process is interrupted. Some paper cuts are brought up on the Elixir Forum, but I've seen them evaluated as not providing enough benefit to developers or being against design - and I found them through trying to solve a very similar problem. I like some changes in the devex direction (e.g. the recent LSP initiative), yet I think it's lagging behind other languages.
I often get cut by various things, often small ones, but there are so many. The disappointment is amplified by good experiences with other (I'll give it to them: more popular or better funded) languages. Yet given the very static download count of Jason on Hex, I think that new projects are rarely started in Elixir, while Erlang's popularity is slowly but visibly growing, so I don't think I'm alone in my perception.
I will try to respond succinctly to the follow-up so as not to blow up an already large text, so let me know if you'd like more info. https://imgur.com/a/iWbTEUf I've uploaded some weirdness I experienced in recent weeks/months.
> (ExDocs breaking) ... please give an example
In screenshot - happened with Chromium, Safari and Firefox ~6 months ago. Often with OTEL libraries. Today I can't reproduce, but I also have DNS blackholing enabled. Maybe it's fixed or maybe that was analytics breaking on me.
> WRT: checking if you can dynamically add children to a Supervisor
Our case was a bug caused by a change from Dynamic to Normal (we had an app that would be replaced in a specific context, but otherwise should be supervised as usual). After that we started observing comm channel blocks due to dead connections - it was a 77 1/2 bug: line 77 shut down the child and line 78 deleted the child; at 77 1/2 the Supervisor restarted the child so it couldn't be deleted anymore - and it was able to pick up some comm channels. It's not hard to fix, but one has to know that. I don't like "not recommended", as many things aren't recommended but we do them given the circumstances. It's better to know the difference and be able to make the decision oneself.
> Also note docs are available in the terminal, both inside `iex` or by doing `mix help SomeModule`
`mix help Ecto` shows missing task, so I suppose it's a typo (or something I don't have). Help in `iex` (and probably `iex -S mix`) requires dependency download, build, maybe rebuild and some flag and env manipulation (so that app doesn't start entirely) and I hope I started it before breaking compilation because otherwise it won't. Yes, it's there - I agree, but it takes energy to use.
> ...but note we have always listed the versions in the sidebar
I know; however, as text moves places, the worst possible example looks like this: search for A - open - notice wrong version - check which one I should be using in code - change to version B in sidebar - not linked, got bounced to home page - search in ExDocs - nothing found (searchbox often fails to return results, see screenshot) - get back to search engine - type exact version and query - click there. When it repeats multiple times it becomes unavoidable busywork.
> - "compilation process can break and Elixir isn't aware that it was interrupted" - our tooling has code to deal precisely with this: https://github.com/elixir-lang/elixir/blob/c5c87a661efac6809... If it still happens, it is a bug and must be reported, so we can fix it
I'm 99% sure that it's a result of circular dependencies, and maybe one pass fails but then the other starts overwriting or something. But it could also be something in the compilation pipeline (we have extra steps). I wish there was something like "mix elixir_checks.compile_consistency" (with a flag to send a bug report). Right now filing a bug means isolating and justifying it. It takes energy, especially when the codebase is complex, big and ridden with prior decisions. I considered doing that, but I think the environment is defensive, and I'm an easy person to pull into fights, though I don't enjoy them.
> "dependency compilation is not parallelized"
I made a mental shortcut, i.e. it's using only one core right now, and it's taking approx. 2-3 minutes. Looks like the PR would resolve it, but I'm not sure when we'll be able to use it.
> when doing umbrella tests on non compiled code seed influences order of compilation
In short (I don't know the cause): if I run `mix test` from the umbrella, I see a different compilation order for applications and their dependencies (if I hadn't compiled those before). Those applications aren't guaranteed to be in a homogeneous dependency state (in fact, when I look at dependencies in `mix.exs` I can see popular libraries spread across 3-4 different major versions). An unlucky run happens, and the consensus is "don't debug, `rm -fr _build`".
> Our philosophy here is to emit compilation warnings instead of compilation errors whenever possible, so you can run, debug, and test your code, instead of forcing your code to be pristine while you are still working on it.
This is a big pain point for me. I care about some warnings, but not about others (e.g. in libraries that are planned to be dropped). I also can't enable the ones I'd like (deprecations, so my colleagues don't use something we want to sunset in root, or the dreaded circular dependencies). I solved this with a complex check chain and custom filters, but in the "competition" I get those out of the box.
I won't say that Elixir is worse technology it's just... I know others which are better (but not BEAM, BEAM is THE BEST)
So you are getting the download count of one package, one that has been added to Erlang/OTP (and Elixir itself) and is more than expected to decrease in download count, to estimate the popularity of the whole language and an ecosystem of 20k+ packages? And over what time period exactly?
> (ExDocs breaking) ... please give an example
> In screenshot - happened with Chromium,Safari and Firefox ~6 months ago. Often with OTEL libraries.
Got it. I was reminded that there was unfortunately one version of ExDoc with a sidebar bug and they probably were still using it. If you ask the package authors to update `ex_doc` and republish the docs, it should take 2 minutes to fix it. They might already have done it though.
Regarding ExDoc's search, people have been asking for improvements, such as searching on the latest version by default and searching across packages. I am glad to say there is work happening in this area (including, soon, the ability to search across all of the dependencies of your own project).
The other bug in your screenshot, about Ecto.Query, please report it if you can reproduce it. It is indeed a "wat" bug but I am not sure what could be causing it. EDIT: I was told this may happen if you are using a mocking library, here is a reproduction and a bug report: https://github.com/jjh42/mock/issues/151 - if you are using mocking libraries, please double check if they can be the root cause.
> I made a mental shortcut - i.e it's using only one core right now, and it's taking approx. 2-3 minutes. Looks like PR would resolve it, but not sure when we'll be able to use it.
It should not be using one core, even if it compiles one dependency at a time. My whole point is that you get parallelism from within the dependency/project.
> when doing umbrella tests on non compiled code seed influences order of compilation
Honestly, I have no idea how this could possibly be the case. Elixir's seed is applied per process and, in this case, it is only applied to the process running your tests, which is not the process related to compilation at all. I will drop a comment in your other thread about cycles in your umbrellas, which is also not possible.
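To illustrate the "per process" part, here is a small sketch that uses Erlang's `:rand` module directly. This is an assumption made for illustration and does not reproduce how ExUnit applies its seed. Random state belongs to the process that seeded it, so another process reseeding itself leaves the caller's sequence untouched:

```elixir
# Seeds the *current* process and returns its first draw.
first_draw = fn seed ->
  :rand.seed(:exsss, seed)
  :rand.uniform(1_000_000)
end

# Reference value: first draw for this seed, computed in a throwaway process.
expected = Task.await(Task.async(fn -> first_draw.({1, 2, 3}) end))

# Seed the calling process, then let another process reseed itself and draw.
:rand.seed(:exsss, {1, 2, 3})
_other = Task.await(Task.async(fn -> first_draw.({9, 9, 9}) end))

# The other process did not disturb our state: this is still the first draw
# of the {1, 2, 3} sequence.
actual = :rand.uniform(1_000_000)

IO.inspect(actual == expected, label: "caller state untouched")
```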
> I can see popular libraries spread across 3-4 different major versions)
This is also something that should not be possible. I mean, you can specify different major versions, but the dependency resolution will guarantee they all agree on a single one. For example, you can't have different versions of `Jason` in the same umbrella, unless there is something really unconventional or undesired on how you are building your umbrella apps. So I would need a mechanism to reproduce it in order to pinpoint what. I would double and triple check your apps configuration, it seems there is something really unexpected going on.
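For reference, a sketch of what a child app's `mix.exs` typically looks like when it is generated inside an umbrella. The app name `:app_a` and the `jason` requirement are placeholders, not taken from the project discussed here. Every child points at the same `deps_path` and `lockfile`, so there is only one place a resolved version can live:

```elixir
defmodule AppA.MixProject do
  use Mix.Project

  def project do
    [
      app: :app_a,
      version: "0.1.0",
      # All umbrella children share the build, deps and lock file of the root.
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      deps: deps()
    ]
  end

  defp deps do
    # A sibling app may state a different requirement, but resolution has to
    # settle on a single version that is written to the shared mix.lock.
    [{:jason, "~> 1.4"}]
  end
end
```

If a child app in a real project points `deps_path` or `lockfile` somewhere else, that would be one place to start looking.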
> I would double and triple check your apps configuration, it seems there is something really unexpected going on.
More than one thing for sure. It's a big and highly heterogeneous ecosystem (multiple umbrellas) in a distributed environment with high idempotency requirements. Without safeguards, decisions were made that today make things very complicated. It's difficult to challenge long-used patterns without a hard recommendation or concrete evidence (I looked into Perceived Complexity analysis for Elixir but couldn't find anything).
My organization now builds the story of "the Elixir codebase is hard to work with and unreliable - let's switch to other tools". I don't like that story because I still remember all the fun I had and all the systems I produced that stood for years with 0 maintenance. But those were small teams and small projects, and today it's an enormous Jenga tower that's risky to breathe around.
I would go as far as to say that our codebase is somewhat of a Petri dish for all kinds of issues (especially on dev/test envs, but not only). I've seen code merged to main branch because it wasn't picked up as changed and used stale cache, multiple-Elixir and OTP versions used in compilation, arch spillovers and more.
>I can see popular libraries spread across 3-4 different major versions), This is also something that should not be possible.
We have overrides and I don't see the umbrella test helper so I guess that umbrella-level overrides don't play nice with non-compiled in-app test runs.
> I am glad to say there is work happening towards this area (including soon the ability to search across all of the dependencies of your own project).
Looking forward to it, one thing that I often change too, is changelog across libraries, so it would be nice to always have those up-to-date.
> It should not be using one core, even if it compiles one dependency at a time. My whole point is that you get parallelism from within the dependency/project.
Project or a single-dependency compilation is fine - I felt different after recent updates to our stack and won't complain. In one umbrella I have opened I see ~250 deps packages and deps.tree shows me ~6500 lines of output, some of those are compiled multiple times - I blame the loops.
I have similar CPU - 8 performance cores and 4 efficiency ones. Usually deps.compile takes less than 100% of total CPU with spike to 150%. On partitioned tests I can feel the warmth of 1100% CPU usage (it also makes me smile, because I like big numbers). Right now I'm thinking that maybe I could spawn ~250 containers, make each compile dependency and then merge output into one and see what broke ;-)
> So you are getting the download count of one package, one that has been added to Erlang/OTP (and Elixir itself) and is more than expected to decrease in download count, to estimate the popularity of the whole language and a ecosystem of 20k+ packages?
Not ideal, but the best I could find. I also looked at Ecto, which is standard, but figured that JSON is used in projects more often than a database. Given the quality of the software itself I'd expect a steady increase on the "core" libraries. But I also hear from prior projects about them being sunsetted. 2 or 3 fewer projects in Elixir, no big deal. In my current organization a few of us are actively advocating for Elixir and BEAM. We're a minority, and the first things newcomers encounter are a difficult stack setup (Erlang and Elixir versions), long compilation times, hundreds of odd warnings, and an LSP that takes 40s to pick up on changes and highlight some errors.
--
I'm not in position of making any demands, it's self inflicted 99% of the time, and it's not a bug that can be fixed upstream it's just a subjective experience, and I wish it could be better.
For example, you can't have loops in deps, and therefore ~250 deps should not print a 6,500-entry tree. At least, for this particular problem, you can isolate your project structure, without any code, and try to reproduce it externally. And, while you can override deps, the goal of an umbrella is to share dependencies, so overriding an umbrella sibling dependency is a smell too.
You said it's an enormous Jenga tower that's risky to breathe around, but it seems at the same time no one wants to invest in an air purifier. If it is of any help, you can look at the Remote case on the Elixir website (https://elixir-lang.org/blog/2025/01/21/remote-elixir-case/): they have a large codebase, around 15k files, 300 engineers (several dozen being Elixir ones), and while their codebase is healthy, you can see they had to invest in some "bottlenecks" that appeared along the way, such as CI times. And the need to invest in the code base itself will be true of any language as time passes. Best of luck!
I have my own personal project that very quickly started to suffer from similar problems, and I move between tech, and given very good experiences with Go, that's something I'd be looking at. I like async, and Go has fun async and great tooling to break/fix code.
There are some things that rubbed me wrong way about it. Like typing but still giving way to runtime errors in specific scenarios or lack of macros (which is what makes Rust great to work with because otherwise it’s just sea of boilerplate eventually).
IMO the more sensible decision would be moving „down”. One still has to use some Erlang in Elixir (for example for tracing) and there are small benefits I appreciate lately - like visual differentiation between variable and an atom.
I feel you on the macros, I have wanted them too, but I respect the language creator's commitment to minimalism, and I don't feel that e.g. JSON decoders are too much effort. It seems the language is headed down the route of code generators rather than macros, which seems like a reasonable tradeoff to me. [2]
[1] https://tour.gleam.run/table-of-contents/ [2] https://gleam.run/news/improved-performance-and-publishing/#...
I do, however agree with you on macros and code generations. My hand in Rust is macro heavy (I dislike boilerplate) but in Go I learned to appreciate codegen utilities and it might be the way to go.
The topic itself is interesting, because I’ve been doing „business logic in types” and it’s impossible to pull off without invoking so much magic that the keyboard starts to emit indigo. That puts Gleam in an awkward place, because once we are there, maybe it’s easier to write code generation with Prolog/Cue, or, instead of adding another layer, just settle for Erlang/BEAM assembly.
But my problems are more in the domain of „what happens when, during a daylight saving shift, I receive an out-of-order message that should be included in a generated report in ordered fashion, and one of the nodes died at that point”.
Dependencies are commonly in foreign programming languages, where the compiler might run things concurrently outside of BEAM control. It's not uncommon that Mix is configured by dependencies to just pull a binary for the local architecture from some repo instead. Perhaps that's what you ought to do in your CI and deploy flows, instead of compiling everything.
"- a lot of knowledge is implicit (try checking if you can dynamically add children to a Supervisor)"
What do you mean, "implicit"? It's in the docs:
I suspect OP is using an umbrella app as a shared library or something. That is the only explanation I can think of that can cause the issue with compilation order.
About documentation, not quite sure what the OP is talking about. Elixir and Erlang have really good documentation.
Anyway, to truly appreciate Elixir (and for that matter Erlang), one needs to understand OTP and the philosophy behind it. It is not just a language but a framework to build concurrent applications.
I did Elixir for a year or so. I have to agree. I had to routinely jump up several layers of call sites to understand what I could do with the arguments.
Development where teams share stuff benefits so much from static types. For teams, the best I have experienced is Go.
The BEAM is great and all other systems are just partial and poor implementations of it, as they say. But k8s does a lot. If I could get typing that actually helped me, like in Go, Elixir would jump up my recommendation list
What? This is very clear: <https://www.erlang.org/doc/apps/stdlib/supervisor.html#start...>, as is this: <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#start_child...>. What am I missing?
> - this order of compilation influences outcome and can lead to buggy code
Do you have an example of this? It's my understanding that the big thing about both Erlang and Elixir are that they're functional languages, it doesn't matter what's compiled when. Is this some nightmare compile-time code manipulation thing?
> ...searching [the docs] shows random versions and there isn’t even a link to „open newest”
Is scrolling to the top of the list contained in the version selector near the top left of hexdocs.pm not good enough? If not, why not?
The docs problem is more of a Google problem. For some reason Google still only shows the 1.12 docs for a lot of searches. The sidebar issue was fixed more recently, I think in the last year. But basically the sidebar wouldn't get loaded until Mermaid finished loading, so it was updated to defer loading of Mermaid. The latest version of ExDoc shouldn't have this problem.
No, my experience says that's not true. I have this code in one of my projects and it works just fine:
  -behavior(gen_server).

  start(Mod, Args) ->
    supervisor:start_child(secret_project_worker_sup:getName(), [{Mod, Args}]).

I can spawn as many of those guys as I like and they all become children of the named supervisor. The named supervisor is a 'simple_one_for_one' supervisor with a 'temporary' restart policy.

I guess the thing that might trip folks up with how the docs are worded is not noticing this further up in the document
A supervisor can have one of the following restart strategies specified with the strategy key in the above map:
...
* simple_one_for_one - A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
and that 'start_child/2' accepts EITHER a 'child_spec()' OR a list of terms

  -spec start_child(SupRef, ChildSpec) -> startchild_ret()
      when SupRef :: sup_ref(), ChildSpec :: child_spec();
                  (SupRef, ExtraArgs) -> startchild_ret()
      when SupRef :: sup_ref(), ExtraArgs :: [term()].

and that only the 'child_spec()' type can have an identifier, so the first bullet point in the list of three in the function documentation does not apply.

Also, I find the way the docs USED to print out function types a bit easier to understand than the new style: <https://web.archive.org/web/20170509120825/http://erlang.org...>. (You will need to either close the Archive.org nav banner or scroll up a line to see the first line of the function type information, which is pretty informative.)
  defmodule Testing.Application do
    use Application

    @impl Application
    def start(_type, _args) do
      children = []
      opts = [strategy: :one_for_one, name: Testing.Supervisor]
      Supervisor.start_link(children, opts)
    end
  end

  defmodule Testing.Server do
    use GenServer

    def start_link(_), do: GenServer.start_link(__MODULE__, [])

    @impl GenServer
    def init(_), do: {:ok, nil}
  end
When you try to start more than one child, it fails:

  Erlang/OTP 25 [erts-13.2.2.11] [source] [64-bit] [smp:14:14] [ds:14:14:10] [async-threads:1] [jit:ns]
  Interactive Elixir (1.17.3) - press Ctrl+C to exit (type h() ENTER for help)
  iex(1)> Supervisor.start_child(Testing.Supervisor, {Testing.Server, []})
  {:ok, #PID<0.135.0>}
  iex(2)> Supervisor.start_child(Testing.Supervisor, {Testing.Server, [:x]})
  {:error, {:already_started, #PID<0.135.0>}}
But defining a child spec that sets the id:

  defmodule Testing.Server do
    use GenServer

    def start_link(_), do: GenServer.start_link(__MODULE__, [])

    def child_spec(arg) do
      id = Keyword.get(arg, :id)
      %{id: id, start: {__MODULE__, :start_link, [[]]}}
    end

    @impl GenServer
    def init(_), do: {:ok, nil}
  end
solves the problem:

  Erlang/OTP 25 [erts-13.2.2.11] [source] [64-bit] [smp:14:14] [ds:14:14:10] [async-threads:1] [jit:ns]
  Interactive Elixir (1.17.3) - press Ctrl+C to exit (type h() ENTER for help)
  iex(1)> Supervisor.start_child(Testing.Supervisor, {Testing.Server, id: 1})
  {:ok, #PID<0.135.0>}
  iex(2)> Supervisor.start_child(Testing.Supervisor, {Testing.Server, id: 1})
  {:error, {:already_started, #PID<0.135.0>}}
  iex(3)> Supervisor.start_child(Testing.Supervisor, {Testing.Server, id: 2})
  {:ok, #PID<0.136.0>}
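For comparison, a sketch of the same idea with `DynamicSupervisor`, which does not key children by id, so the default child spec from `use GenServer` can be started repeatedly. `Demo.Worker` is a hypothetical module used only for this illustration:

```elixir
defmodule Demo.Worker do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl GenServer
  def init(arg), do: {:ok, arg}
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# Both children use the default spec (id: Demo.Worker); the id is ignored
# by DynamicSupervisor, so the second start does not clash with the first.
{:ok, a} = DynamicSupervisor.start_child(sup, {Demo.Worker, :x})
{:ok, b} = DynamicSupervisor.start_child(sup, {Demo.Worker, :y})

IO.inspect(DynamicSupervisor.count_children(sup))
```

The trade-off is that children are then addressed by pid rather than by id, for example in `DynamicSupervisor.terminate_child/2`.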
Oh, sure, you can vary the ID in non-'simple_one_for_one' supervisors there to make that work. Apologies for inducing you to write out all that transcript and code.
But, OP's claim was:
> - a lot of knowledge is implicit (try checking if you can dynamically add children to a Supervisor)
which is just not fucking true no matter how you slice it. It's true that the relevant documentation doesn't literally say "Calling 'start_child/2' is valid for any kind of 'supervisor'. That's why it's here... to dynamically add children to a 'supervisor'.", but if one bothers to read the docs on supervisors and the function in question it's clear that that's the entire point of 'start_child/2'.
So I go into docs, Elixir because that’s the primary source and when I search for „dynamic” and „supervisor” everything points to (no fanfares) DynamicSupervisor. And yet I look at the diff where 12 months earlier my colleague changed DynamicSupervisor to Supervisor because connection adapter tended to crash and not start back up and that day I debug zombie connections.
Erlang has it clearly explained, and ultimately, after a couple of hours of research, I found a solution and fixed it, but there was no warning that between the two lines of shutdown and delete BEAM could restart a child, leading to a shadowing (and, in that case, unmanaged, as the adapter manager lives in another app that has a slot for a single process only) zombie connection handler.
This is the stuff people rarely look at because many systems don’t have high idempotency requirements but (only a parable, that’s not my industry) would you like your system to administer a double dosage of potentially lethal drugs? None is preferable, but it’s still far from happy system land.
As for Hexdocs.pm - this is once again about operating cost. Yes, I can do this, but search is not great and I more often have broad queries and search for patterns or guidelines. I rely on „map mode” (how I can find knowledge) and not on „collect mode” (i.e. I keep knowledge), due to some mostly intrinsic traits. And thanks to the wonders of modern human tracking^W^Wtechnology I can show anecdotal data of the impact.
When working with Elixir I spent (one random, recent day) 40% of my time on browsing Hexdocs and 60% in editor+console.
One random recent day on Go I spent 95% time in E+C and 5% in documentation.
My first line of code written in Go was less than 6 months ago; my first line of Elixir was written years ago. I would have to check my resume for when exactly, but something like 8 years ago, and I toyed with it when it wasn’t yet deemed production ready.
If you're talking about calling 'terminate_child/2' followed by 'delete_child/2', there is a very explicit warning. From <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#terminate_c...>
> A non-temporary child process may later be restarted by the supervisor.
and from <https://www.erlang.org/doc/apps/stdlib/supervisor.html#termi...>
> The process, if any, is terminated and, unless it is a temporary child, the child specification is kept by the supervisor. The child process can later be restarted by the supervisor.
In the English-language docs, the warning is very clear: children that a supervisor will restart on termination may be restarted before you get around to calling 'delete_child'. Children with a restart type of 'permanent' will always cause a race between 't_c' and 'd_c'. Children with a restart type of 'transient' will cause a race if they terminate abnormally... which is a term defined by the same document that the warning comes from.
You keep insisting that this stuff isn't documented. Are you perhaps reading a poor translation of the docs?
> You’re coming from position of knowledge.
No, I'm not. I've forgotten most of the details I ever knew about how the system works and only remember broad strokes. My experience with Erlang/OTP was scattered throughout my spare time over an eight to twelve month period ten years ago. Unlike you, I've never been paid to work with it, and you've worked with it far more recently than I.
The reason I was able to direct you to the right part of the docs was because I said to myself "Wow, it would be fuckin stupid if you couldn't dynamically add children to a supervisor. I remember the Erlang docs being really, really good, so let's see if they failed to describe if and/or how this works.", and then spent like five minutes reading the docs and another couple cross-checking with the Elixir docs.
> I don’t memorize documentation because it changes often...
Do the rules for "+", "-", and (if present) "%" change often in most major languages when they are used with built-in types? [0] It's always well worth your time to learn the rules for commonly-used, bedrock parts of a language, major libraries, and runtime systems that you intend to use. Bedrock parts don't substantially change, because if they did, they would invalidate every program ever written against those parts.
OTP provides a collection of bedrock parts, and supervisors are one such part. Memorizing docs is really stupid, but having a solid understanding of how the major things you use work is always worth the time.
Seriously, how would you function as a programmer if you didn't know how addition or string concatenation worked? If you're using OTP's supervisors, having a good understanding of how they function is just as fundamental.
> One random recent day on Go I spent 95% time in E+C and 5% in documentation.
Sure, that makes some sense. OTP is far more complex and robust than anything Go offers, so it's quite a bit quicker to come up to speed with what's documented in the Go official docs than what's in the OTP docs. Also, as someone who has written Go professionally for the last five, ten years, I warn you that you're going to get turbofucked by the things that -if they are documented at all- are documented only in blog posts or random tutorials. Go is an absolute grab bag of poorly-documented sharp edges and surprising behavior.
> I rely on „map mode” (how I can find knowledge) and not on „collect mode” (i.e. I keep knowledge)... [and] I can show anecdotal data of the impact.
The impact of you failing to familiarize yourself with the necessarily-complex tools you chose [1] to use seems pretty clear to me. You got tripped up by documented behaviour that you didn't bother to understand, which caused you to spend hours looking for solutions that would have been clear after a ten minute trip to the documentation for the 'supervisor' module. Given what you've said elsewhere in this subthread about how your project is "an enormous Jenga tower that's risky to breathe around", I suspect that a significant number of your coworkers also refused to familiarize themselves with the tools they use.
[0] No, they do not.
[1] (or were obligated)
In terms of dynamic or in terms of behavior between them? For former - yes they do change, not often, but they do. Even Elixir is right now raising warnings that `-0.0` and `+0.0` will not be equal, which implies also changes in addition and subtraction (e.g. cancelling out event's value in event based system value might impact on system's behavior).
If that's the latter then it deserves a blog post of its own, because some can add mixed types, some are casting in a specific way, some are copying data, some are mutating data, some are doing heuristic casting, some are crashing, some leak memory, some allow modifying pragmas, some allow implicit overloading.
It's a jungle out there. ...and it reminds me of the academic joke that '2 + 2 = 5 given extremely high values of 2' - a funny one until you spend a night trying to figure out why this happens and another two planning vengeance on the person who decided that int->float->int is a good trick to use a helpful float-taking function on an otherwise perfectly fine integer.
The worth of remembering is something I perceive as a trade-off between cost, usefulness and available memory space. I'd rather remember that in Elixir/BEAM child shutdown and removal is message driven (and thus can cause a race condition) than whether I need to use `+` or `++` for concatenating lists.
It's a good thing I asked specifically about built-in types in a particular system, and didn't ask about comparisons between operators in different languages.
> For former - yes they do change, not often, but they do. Even Elixir is right now raising warnings that `-0.0` and `+0.0` will not be equal...
Sure, that's a change to optional behavior to comparison of floating-point zeros. That doesn't change how equality testing, addition, subtraction, or -if available- modular arithmetic works.
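A quick sketch of the distinction, hedged because the strict comparison depends on the Erlang/OTP version in use (newer releases treat the two zeros as different terms for `===` and pattern matching, older ones do not). The arithmetic and the `==` comparison are the parts that should not move:

```elixir
pos = 0.0
neg = -0.0

# Arithmetic is unaffected by the sign of zero.
sum = 5.0 + neg
cancelled = 5.0 - 5.0

# Arithmetic comparison treats both zeros as equal.
loose = pos == neg

# Strict comparison is the version-dependent part, so it is only printed.
IO.inspect({sum, cancelled, loose, pos === neg})
```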
As I said:
> It's always well worth your time to learn the rules for commonly-used, bedrock parts of a language, major libraries, and runtime systems that you intend to use. Bedrock parts don't substantially change, because if they did, they would invalidate every program ever written against those parts.
I disagree. I've been long enough around to see languages sunsetted, libraries sunsetted. Big systems are standing on shamefully old versions. If your job is to work on one language - I agree, but when working with 100s of systems that go over multiple OTP versions, multiple Elixir versions, sprinkled with JavaScript, TypeScript, Ruby 1.0, Elm, Java, "oh my dear is it Python 2 running CoffeeScript?!", then memorizing anything is pointless, because chance is that thing that you memorized is:
- not yet in this project
- no longer in this project
- that tech isn't in the project
- project is written in Malbolge, everything you know is irrelevant
- is explicitly forbidden by code owner (for more or less sensible reason)
> Bedrock parts don't substantially change, because if they did, they would invalidate every program ever written against those parts.
Been there, done that, bought a t-shirt. I dislike TypeScript for exactly that reason [0], but in Elixir the same is true if you rely on the --warnings-as-errors flag, due to the (in my opinion) broken deprecation mechanism.
Software is full of leaky abstractions. Do you know that it's not guaranteed that your system clock is monotonic? [1]
[0]: https://github.com/microsoft/TypeScript/wiki/Breaking-Change... [1]: https://github.com/rust-lang/rust/blob/e2223c94bf433fc38234d...
Yes. Wall-clock time is adjustable. That's why there's a monotonic clock function on any serious OS that's running on hardware that makes such a function possible.
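On the BEAM this is exposed as `System.monotonic_time/1`, next to the wall-clock `System.system_time/1`. A minimal sketch of measuring a duration with the monotonic clock (the 20 ms sleep is an arbitrary stand-in for real work):

```elixir
# Monotonic time never goes backwards, so it is the one to use for durations.
started = System.monotonic_time(:millisecond)
Process.sleep(20)
elapsed_ms = System.monotonic_time(:millisecond) - started

# Wall-clock time can be adjusted (NTP, manual changes), so use it for
# timestamps, not for measuring intervals.
wall_clock_ms = System.system_time(:millisecond)

IO.inspect({elapsed_ms, wall_clock_ms})
```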
> ...but when working with 100s of systems that go over multiple OTP versions [it's not worth understanding how anything that's bedrock works]...
Welp, let's go back to the behavior of 'supervisor' in 2007... the earliest version of that page of the docs that the Wayback Machine has: <https://web.archive.org/web/20070707071556/http://www.erlang...>
Hey, look at this description and warning in 'terminate_child/2'
> Tells the supervisor SupRef to terminate the child process corresponding to the child specification identified by Id. The process, if there is one, is terminated but the child specification is kept by the supervisor. This means that the child process may be later be restarted by the supervisor. The child process can also be restarted explicitly by calling restart_child/2. Use delete_child/2 to remove the child specification.
Hell, read the rest of that document... notice that the behavior described from nearly twenty years in the past is the same as now. (And I bet you One American Nickel that the behavior described in 1997 is also the same.)
If you don't consider that to be bedrock functionality and worth familiarizing yourself with, I don't know what to tell you.
As I showed in the other post, this is incorrect (at least in Elixir documentation).
Fortunately I read that last, or I'd have refrained from further conversation, but regarding my situation and memory - I didn't choose this; I was born with a specific type of memory and specific traits. It's useful and it built me a rewarding career. Often problems in software are caused by assumptions, and I can't have any. Thanks to that I can work on interesting systems that have hair-pulling problems.
However, this attitude of both shaming ("you should just memorize") and a "works for me" approach is one I've seen often in Elixir's community, and it's why I don't want to have such conversations in official places. I don't feel a need to be present in an environment where I'm not welcome. And yet, peculiarly, I'm often brought in as a decision maker regarding recommending, choosing or sunsetting technologies, and given lack of parameters I do fall back and it wasn't that great.
Okay? I'm doing neither, so I don't see why you're bringing that up. I've consistently rebutted your claims that something wasn't explicitly documented by pointing out where it's explicitly documented. I've also called memorization of documentation a fucking stupid thing to do.
> As I showed in the other post, this is incorrect (at least in Elixir documentation).
As I've mentioned in the other post, I don't see how this is incorrect, and await your detailed walkthrough.
This is not written anywhere explicitly in the docs - I also agree that Erlang's documentation is much better, but I’m not saying that Erlang is missing information. I’m talking about Elixir not providing this and marking it clearly - because if I need to start reading in Erlang first then why would I layer Elixir on top of it? This is exactly the thing I’m pointing out.
Because your response is long I'll only focus on this point and (hopefully) get back later.
My expectation (implicit) would be that when function is doing 2 lines the messages would be locally ordered. Yes, maybe that’s silly, but in many other languages that’s exactly the case. If I send messages to queue I’m aware that queue might not get two of those. I need to send a transaction, fine. If I broadcast or make a signal/event same happens. But here I have synchronous function with no indication or warnings that it’s a message.
If this can’t be known in documentation, isn’t caught by compiler/analysis, but requires experience or (often) reading source code it is implicit knowledge.
Yes, I possess it too now, but I think it’s a problem.
It absolutely is. I'll use the Elixir docs as my source:
> A non-temporary child process may later be restarted by the supervisor.
And, further up in the docs when talking about the circumstances under which a supervisor will restart a child that has terminated: [0]
Restart values (:restart)
The :restart option controls what the supervisor should consider to be a
successful termination or not. If the termination is successful, the
supervisor won't restart the child. If the child process crashed, the
supervisor will start a new one.
The following restart values are supported in the :restart option:
:permanent - the child process is always restarted.
:temporary - the child process is never restarted, regardless of the
supervision strategy: any termination (even abnormal) is considered
successful.
:transient - the child process is restarted only if it terminates
abnormally, i.e., with an exit reason other than :normal, :shutdown, or
{:shutdown, term}.
For a more complete understanding of the exit reasons and their impact, see
the "Exit reasons and restarts" section.
And the "Exit reasons and restarts" section says: [1]
> A supervisor restarts a child process depending on its :restart configuration. For example, when :restart is set to :transient, the supervisor does not restart the child in case it exits with reason :normal, :shutdown or {:shutdown, term}.
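To make the quoted restart rules concrete, here is a minimal sketch that stops a child with reason :normal under each :restart value and checks whether the supervisor started a replacement. The module name RestartDemo, the :demo_child id, and the use of an Agent as the child are assumptions made purely for illustration; they are not from the docs or from anyone's production code.

```elixir
defmodule RestartDemo do
  # Hypothetical child: an Agent standing in for a real worker.
  def spec(restart) do
    %{id: :demo_child, start: {Agent, :start_link, [fn -> 0 end]}, restart: restart}
  end

  # Stops the child with reason :normal and reports whether the
  # supervisor started a replacement process.
  def restarted_after_normal_exit?(restart) do
    {:ok, sup} = Supervisor.start_link([spec(restart)], strategy: :one_for_one)
    [{:demo_child, old_pid, _, _}] = Supervisor.which_children(sup)

    :ok = Agent.stop(old_pid, :normal)
    children = wait_for_exit_handling(sup, old_pid)
    Supervisor.stop(sup)

    # A pid in the child entry means a new process is running.
    match?([{:demo_child, pid, _, _}] when is_pid(pid), children)
  end

  # Polls until the supervisor no longer lists the old pid, i.e. until it
  # has processed the child's exit.
  defp wait_for_exit_handling(sup, old_pid) do
    case Supervisor.which_children(sup) do
      [{:demo_child, ^old_pid, _, _}] ->
        Process.sleep(10)
        wait_for_exit_handling(sup, old_pid)

      children ->
        children
    end
  end
end
```

Under these assumptions, only the :permanent child should come back after a :normal exit; the :transient and :temporary children should not.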
You go on to say:
> But here I have synchronous function [to affect the state of a supervisor] with no indication or warnings that it’s a message.
Before I get into that, I have two questions for you:
1) How do you affect an Erlang or Elixir process without sending it a message? The docs for Processes [2] don't indicate any other way.
2) Have you never seen or written a function that does not return until it receives the response to an async operation?
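For readers who have not seen the pattern in question 2, a rough sketch of a function that looks synchronous but is built from an asynchronous send plus a blocking receive might look like this. SyncOverAsync, the {:double, ...} message shape, and the timeout default are all made up for the example; this is not how Supervisor or GenServer are implemented internally, only the same general idea.

```elixir
defmodule SyncOverAsync do
  # Hypothetical server process: replies to each request with the doubled value.
  def start do
    spawn(fn -> loop() end)
  end

  # Looks like an ordinary synchronous function to the caller, but it is a
  # message send followed by a blocking receive on a unique reference.
  def double(server, n, timeout \\ 1_000) do
    ref = make_ref()
    send(server, {:double, self(), ref, n})

    receive do
      {^ref, result} -> {:ok, result}
    after
      timeout -> {:error, :timeout}
    end
  end

  defp loop do
    receive do
      {:double, from, ref, n} ->
        send(from, {ref, n * 2})
        loop()
    end
  end
end
```

The caller only ever sees `SyncOverAsync.double(server, 21)` returning a value; nothing in the call site shows that a message round trip happened.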
Continuing on... from the top of the Supervisor docs, we see:
> A supervisor is a process which supervises other processes, which we refer to as child processes.
"A supervisor is a process...", straight off the bat. That's super clear and explicit, but I'll keep walking through the docs to show you how else this information is communicated to the reader.
If we read on, we see that the first argument to the 'stop_child/2' and 'delete_child/2' functions is of type 'supervisor()', which is defined as '@type supervisor() :: pid() | name() | {atom(), node()}'. What are these? Well, check the docs for how you start a Supervisor. [3] They say three interesting things:
1) The second argument to 'start_link/2' is of type 'option()', which is defined as '{:name, name()}', and 'name()' is defined as 'atom() | {:global, term()} | {:via, module(), term()}' . Keep those types in mind.
2) "If the supervisor and all child processes are successfully spawned (if the start function of each child process returns {:ok, child}, {:ok, child, info}, or :ignore), this function returns {:ok, pid}, where pid is the PID of the supervisor. If the supervisor is given a name and a process with the specified name already exists, the function returns {:error, {:already_started, pid}}, where pid is the PID of that process."
Notice how often it talks about "spawning" the supervisor and returning a PID, and saying that that PID is the PID of the supervisor you just created, or of a named supervisor that already exists.
3) "The options can also be used to register a supervisor name. The supported values are described under the "Name registration" section in the GenServer module docs."
Let's look at the "Name registration" section. [4] I'm not going to quote the whole thing because it'd be a nightmare to reformat sensibly, but the two key sections are
> Both start_link/3 and start/3 support the GenServer to register a name on start via the :name option. Registered names are also automatically cleaned up on termination. The supported values are: an atom ... {:global, term} ... {:via, module, term}...
and the last four items in the bulleted list in the section beginning with
> Once the server is started, the remaining functions in this module (call/3, cast/2, and friends) will also accept an atom, or any {:global, ...} or {:via, ...} tuples. In general, the following formats are supported:
Notice how those bullets match up to the 'name()' type that is passed in to supervisor:start_link/2, and connect that information with the fact that the docs for that function direct you here to learn about how you can register a name for your supervisor. Combine that information with the fact that the first argument to the "Tell the supervisor to do something" functions is of type 'supervisor()' and the fact that 'start_link' returns a PID, and it's really, really clear that a supervisor is another process that you can (optionally) name and refer to by name, rather than PID.
Once we understand that a supervisor is a process, and that the functions to instruct a supervisor to do things require the information required to contact a process, what other conclusion can we draw than "Communications with a supervisor is async, because communications with all processes are async."?
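A small sketch of the conclusion of that walkthrough, assuming a throwaway supervisor with no children and the hypothetical registered name :demo_sup: start_link returns a PID, the name resolves to that same PID, and the Supervisor functions accept either one.

```elixir
defmodule SupervisorIsProcess do
  def demo do
    # :demo_sup is a hypothetical registered name used only for this example.
    {:ok, pid} = Supervisor.start_link([], strategy: :one_for_one, name: :demo_sup)

    result = %{
      pid?: is_pid(pid),
      # The registered name resolves to the PID returned by start_link.
      registered?: Process.whereis(:demo_sup) == pid,
      # The same request gets the same reply whether addressed by PID or name.
      same_reply?: Supervisor.count_children(pid) == Supervisor.count_children(:demo_sup)
    }

    Supervisor.stop(pid)
    result
  end
end
```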
[0] <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#module-rest...>
[1] <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#module-exit...>
[2] <https://hexdocs.pm/elixir/1.18.3/processes.html>
[3] <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#start_link/...>
[4] <https://hexdocs.pm/elixir/1.18.3/GenServer.html#module-name-...>
def start_new(name, config) do
  # Logging set up
  Supervisor.start_child(
    name,
    { HandlerModule, config }
  )
end

def replace_supervisor(name, config) do
  Supervisor.terminate_child(name, HandlerModule) # Success
  Supervisor.delete_child(name, HandlerModule) # Failure
  start_new(name, config)
end
That is exact code. Success and failure were logged. Also (from Erlang's documentation):
> one_for_one - If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
In the docs for terminate_child you can read this (once again Erlang):
> If the supervisor is not simple_one_for_one, Id must be the child specification identifier. The process, if any, is terminated and, [[unless it is a temporary child, the child specification is kept by the supervisor]]. The child process can later be restarted by the supervisor.
https://www.erlang.org/doc/apps/stdlib/supervisor.html#termi...
So yeah, Elixir documentation is wrong.
Sorry, what happened after or during the call to delete_child/2 that caused you to consider it to have failed?
> So yeah, Elixir documentation is wrong.
I don't see what's wrong about the Elixir documentation. Walk me through it, please? Do remember that the default ':restart' value for a supervised child is ':permanent', and that the 'one_for_one' strategy only ensures that the supervisor-initiated restart of one supervised child doesn't cause the supervisor to restart any other supervised children.
After tracing the code this is exactly what happened (in this code exactly):
1. Terminate child X
2. /Supervisor restarts X/
3. Delete child X {:error, :running}
4. Supervisor.start_child Y {:ok, PID}
5. /X and Y are both running/
As for incorrectness:
> the supervisor does not restart the child in case it exits with reason :normal, :shutdown or {:shutdown, term}.
`terminate_child` is sending shutdown and yet it's being restarted.
And to emphasise the use case: the child is a connection handler. The service node changed. It NEEDS to be restarted on crash, but has to be replaced during handoff.
I believe you start to get into "huh?" mode with me. I have a treasure trove of those. (Btw., in the Erlang repository there are plenty of notes mentioning THIS exact behavior and, if I didn't overskim, even some bugs caused by it; you can search for terminate_child.)
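One way to keep a replace sequence like this from ending with both the old and the new child running is to check the return value of every step and stop at the first failure. A minimal sketch, assuming hypothetical names (Handoff, replace_child/3) and a child that is identified by a fixed id; it does not settle why delete_child reported a failure in the case described, it only makes a failed step visible to the caller before anything new is started:

```elixir
defmodule Handoff do
  # Replaces the child registered under `id` with `new_spec`.
  # `with` stops at the first step that does not return the expected value
  # and hands that value back, so start_child never runs after a failed
  # terminate_child or delete_child.
  def replace_child(sup, id, new_spec) do
    with :ok <- Supervisor.terminate_child(sup, id),
         :ok <- Supervisor.delete_child(sup, id),
         {:ok, pid} <- Supervisor.start_child(sup, new_spec) do
      {:ok, pid}
    end
  end
end
```

A caller can then log or retry on anything other than `{:ok, pid}`, for example `{:error, :running}` or `{:error, :not_found}`.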
I question why you're handing off things between supervisors. If this is something you actually need to do, then 'delete_child/2' so the supervisor doesn't restart the child, terminate the child yourself, and re-start the child on the new supervisor.
EDIT: Actually, no, you can't 'delete_child/2'. You need to change the child's ':restart' value from ':permanent' to the value that does exactly what you say you need. I'll leave it to you to read the docs. /EDIT
> `terminate_child` is sending shutdown and yet it's being restarted.
Here's the context for that partial quote that you pulled from [0]:
> A supervisor restarts a child process depending on its :restart configuration. For example, when :restart is set to :transient, the supervisor does not restart the child in case it exits with reason :normal, :shutdown or {:shutdown, term}.
Re-read that first sentence that you chose to not quote. Then read about the ':restart' supervisor configuration and how it describes when a supervised child is and is not restarted. [1]
> I believe you start to get into "huh?" mode with me.
Yep. Selective quoting when it's trivial for your conversation partner to find the lies by omission definitely put me into "huh?" mode with you.
[0] <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#module-exit...>
[1] <https://hexdocs.pm/elixir/1.18.3/Supervisor.html#module-rest...>
I won't advocate strongly, but I think some designs are clearer in Erlang. E.g. nested structs and maps in Elixir are something I consider a problem-to-be.
If one writes a deeply nested structure in Erlang it looks like syntax vomit, so I'd avoid it. But I haven't been writing Erlang for a while, so that might be just an illusion.
But it could simplify code while being perfectly compatible, so I wonder (and there's LFE too which feels like something I both want and don't want to touch)
ExDoc supports outputting EPUB as well as HTML.
> The Supervisor module was designed to handle >>mostly<< static children that are started in the given order when the supervisor starts.
Emphasis mine. But I will spread the knowledge that yes, one can add a child to a running Supervisor at runtime, and the Supervisor will try to apply its supervision strategy to it (as opposed to DynamicSupervisor, which immediately forgets about its child). Replacing a child requires stop-and-remove in a loop (see the sibling comment response for the exact case).
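A small sketch of that difference, using Agents as stand-in children and the hypothetical module name RuntimeChildren: a child added at runtime to a plain Supervisor leaves its spec behind after terminate_child, while a DynamicSupervisor drops the child entirely.

```elixir
defmodule RuntimeChildren do
  # Hypothetical child spec; the Agent just holds its own id.
  def spec(id), do: %{id: id, start: {Agent, :start_link, [fn -> id end]}}

  # Adds a child to an already running Supervisor, terminates it, and
  # returns the supervisor's child counts afterwards.
  def supervisor_counts_after_terminate do
    {:ok, sup} = Supervisor.start_link([], strategy: :one_for_one)
    {:ok, _pid} = Supervisor.start_child(sup, spec(:late))
    :ok = Supervisor.terminate_child(sup, :late)
    counts = Supervisor.count_children(sup)
    Supervisor.stop(sup)
    counts
  end

  # Same steps with a DynamicSupervisor, which addresses children by pid.
  def dynamic_counts_after_terminate do
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
    {:ok, pid} = DynamicSupervisor.start_child(sup, spec(:late))
    :ok = DynamicSupervisor.terminate_child(sup, pid)
    counts = DynamicSupervisor.count_children(sup)
    DynamicSupervisor.stop(sup)
    counts
  end
end
```

With the plain Supervisor the spec count stays at 1 with 0 active children until delete_child is called; with the DynamicSupervisor both drop to 0.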
Apart from some statically typed languages (and even then), most languages have this. Very rarely in any ecosystem does a dependency upgrade not also require manual changing of stuff.
> - when doing umbrella tests on non compiled code seed influences order of compilation
> - this order of compilation influences outcome and can lead to buggy code
I've never seen compilation produce odd artifacts, especially not as a result of compilation order. If the code has the proper compile-time deps, then the result seems stable.
> - documentation is hard to use - searching shows random versions and there isn’t even a link to „open newest”
Isn't this the fault of the search engine not having the latest version indexed? There's also a version selector on the top-left of hexdocs. Navigating to hexdocs.pm/<LIBRARY_NAME> also opens the latest version. This seems like a non-issue to me.
> - a lot of knowledge is implicit (try checking if you can dynamically add children to a Supervisor)
Already covered by another commenter, but also: https://hexdocs.pm/elixir/DynamicSupervisor.html I don't think the knowledge is necessarily implicit, it's just that learning Elixir deeply also means learning the BEAM/Erlang deeply, and there's a lot of Erlang docs.
> - sidebar with new ExDocs break for some reason so there is no navigation
Not a universal problem. Perhaps look into why it's broken on your device and report it as a bug?
> - there is no way to browse navigation outside this broken ExDocs which outputs only HTML and LSP
There's iex. For example `h String.codepoints`. Aside from that, I sometimes just open the relevant library in my deps/ directory and search there.
> - Squiggly arrow works weird (i.e. ~0.3 might catch 0.99)
I genuinely don't understand what you mean by this. There's no tilde operator and ~0.3 is invalid syntax. Can you give a code sample?
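If the quoted complaint is about the `~>` operator in version requirements (an assumption; the original comment gives no code), the behavior can be checked with Version.match?/2. A requirement with two segments only pins the major version, so it accepts much later minor versions than one might expect:

```elixir
# "~> 0.3" means >= 0.3.0 and < 1.0.0, so 0.99.0 is accepted.
Version.match?("0.99.0", "~> 0.3")    # true

# "~> 0.3.0" means >= 0.3.0 and < 0.4.0, so 0.99.0 is rejected.
Version.match?("0.99.0", "~> 0.3.0")  # false

# A later patch release within 0.3.x still matches the three-segment form.
Version.match?("0.3.7", "~> 0.3.0")   # true
```

The same requirement syntax is what mix.exs dependency entries use, so writing the third segment is the way to restrict upgrades to patch releases.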
> - dependency compilation is not parallelized
I think this might be related to the common pattern of code you see in libraries:
  if Code.ensure_loaded?(SomeModuleFromAnotherLib) do
    # Some lib is loaded, add code to integrate with it.
  end
I think that could only be solved if that integration were extracted into another lib with properly set up deps, but due to how common this pattern is I don't think it's ever possible to switch to parallel dep compilation.
> - Dialyzer/dialyxir typespecs are useless most of the time
A type system is being worked on, and each release lately has added more checks.
I agree that compilation is slow & editor integration is meh, but the rest I don't agree with.
Elixir does some computations as well, but when we had to compute 3D LUTs based on video processing algorithms, Ghislain had to write them in C to be fast enough for our needs on embedded hardware.