uv tool install llm
llm install llm-cmd
llm cmd use ffmpeg to extract audio from myfile.mov and save that as mp3
https://github.com/simonw/llm-cmd
So, much like code assistants, they still need a fair amount of babysitting. A good boost for experienced operators, but they might suck for beginners.
And in some situations ffmpeg has warts you have to work around. For example, they recently introduced a moronic change of behaviour where the first subtitle track becomes forced/default irrespective of the original forced/default flag in the source. You need to add "-default_mode infer_no_subs" to counter that.
It's another tool and one that might actually improve with time. I don't see GNU's man pages getting any better spontaneously.
Whoa, what if they started to use AI to auto-generate man pages...
That’s the time to start my career in woodworking.
It works really well.
It is not simply a "generate man page" without context.
Then they'd be wrong about 20% of the time, and still no one would read them. ;-)
(NB: I'm of the age that I do read them, but I think I'm in the minority.)
Or he can just ditch the car and walk. Sure, it's slower and requires more effort, but he knows exactly how to do that and where it will take him.
If we could imagine wiring a pony to control a car, its brain, while good at navigation, would likely be inadequate at the speed that a car attains.
I no longer check with these AI tools after a number of attempts. Unrelated, a friend thought there was a NFL football game last Saturday at noon. Checking with Google's Gemini, it said "no", but there was one between two teams whose season had ended two weeks before at 1:00 Eastern Time and 2:00 Central. (The times are backwards.)
I don't think the notion of "current" has been explained to them. They just define it out of context.
Ask them about the fire in LA in 2025 January.
Did any of the commands look like the ones in the left window:
https://beta.gitsense.com/?chats=12850fe4-ffb1-4618-9215-c13...
The left window contains a summary of all the LLMs asked, including all commands. The right window contains the individual LLM responses.
I asked about gotchas with missing libraries as well, and Sonnet 3.5 said there were. Were these the same libraries that were missing for you?
Where I find GPT to perform better than Sonnet is with text processing. GPT seems to better understand what I want when it comes to processing documents.
I'm convinced that no LLM provider has created or will create a moat, and that we will always need to shop around for an answer.
4o replaced 4 back in April 2024. o1/o1mini replaced 4o in Fall 2024.
stop using 4. use o1mini always. it's cheaper, faster, and better.
o1/o1mini will be replaced by o3/o3mini in a couple months.
The months old sonnet feels a generation ahead of any OAI product I've used, I'll believe the hype on o3 when I see it, remember the sora and voice roll out?
I had this bizarre bug in rust networking code where packets were getting dropped.
i dumped all 20k lines into o1pro. it thought for about ten minutes and came back telling me that my packets had a chance of being merged if sent in quick succession, and that i needed to send the length before each message and scan packets in a loop for subdivisions on the client. this bug hadn't happened before, only when running locally on a newer faster machine, and was frequent but hard to replicate.
it was correct, and provided detailed pseudo code to solve it.
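The fix described above (send the length before each message, then scan the received bytes in a loop) can be sketched roughly like this. It is a minimal illustration in Python rather than the original Rust, and the 4-byte big-endian header and the names `frame` and `extract_messages` are assumptions made for the example, not the poster's code.

```python
import struct

HEADER = struct.Struct(">I")  # assumed 4-byte big-endian length prefix


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


def extract_messages(buffer: bytearray) -> list[bytes]:
    """Pull every complete message out of buffer, leaving partial data in place."""
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # message not fully received yet; wait for more bytes
        messages.append(bytes(buffer[HEADER.size:end]))
        del buffer[:end]
    return messages


# Two messages sent in quick succession may arrive merged in one read,
# and a third may arrive split across reads.
buf = bytearray(frame(b"hello") + frame(b"world") + frame(b"partial")[:6])
print(extract_messages(buf))  # complete messages only
print(len(buf))               # leftover bytes of the incomplete message
```

The receiver would append each chunk it reads from the socket to the same buffer and call `extract_messages` again, so merged and split packets are handled by the same loop.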
the second case involved some front end code where during an auth flow ios would force refresh on returning to the browser, causing authentication state to be lost. o1pro thought for about 5 minutes before telling me ios has a heuristic with which it decides to close an app on context switch based on available ram, etc, and that i needed to conditionally check for ios and store partial state in local storage on leave, assuming the app could be unloaded without my control.
it was correct. with some more back and forth we fixed the bug.
these are not the kinds of problems that claude and gpt<4 have been able to help with at all.
I also used voice, and video voice extensively for translation tasks in korea, japan, and taiwan, and for controlling japanese interfaces and forms for tax documents and software.
These are very good tools.
Perhaps at some point there will be a triage LLM to slurp up the problem and then decide which secondary LLM is most optimal for that query, and some tertiary LLMs that execute and evaluate it in a virtual machine, etc.
Maybe someday
Usually the color was wrong and I don’t care enough to learn about colorspaces to figure out how to fix it and it’s utterly insane how difficult it is even with LLMs.
Just reencode it as is but a little more lossy. Is that so hard?
as in
bash> please "use ffmpeg to extract audio from myfile.mov and save it as mp3"
It will then courteously show you the command it wants to run before you agree to do it.
Here is the whole thing, with its two dependent functions, so that people stop writing their own versions of this lol. All it needs is an OPENAI_API_KEY; feel free to modify for other LLMs.
EDIT: Moved to a gist: https://gist.github.com/pmarreck/9ce17f7996347dd532f3e20a2a3...
Suggestions welcome- for example I want to add a feature that either just copies it (for further modification) or prepopulates the command line with it somehow (possibly for further modification, or even for skipping the approval step)
(honestly, the work you share is very inspiring)
I appreciate the UI choice here. I have yet to do anything with AI (consciously and deliberately, anyway) but this sort of thing is exactly what I imagine as a proper use case.
As a helper and not a replacement, this sounds grand. Like the most epic autocomplete. Because I hate how much time I waste trying to figure out the command line incantation when I already know precisely what I want to do. It’s the weakest part of the command line experience.
ffmpeg commands though? It's really not a practical skill outside of using ffmpeg. There's nothing really rewarding to me about memorizing awkwardly designed CLI incantations. It's all arbitrary.
I'm talking about
1. you have a problem, you try something and it doesn't work. 2. you find an LLM and it "gives" you the answer with one or two tries 3. problem solved! what have you learned? How to have answers given to you when you ask.
or 2. you look for an answer in a dizzying haze of man pages, quacks, and website Q&As. 3. you try and try again and eventually problem solved. you have learned not only how to solve a particular problem but your overall ability to solve similar problems has gone up. You've learned how to fish, not just ask an LLM for a fish.
and yet, i too know that it is perhaps not quite as good for me as finding the fish on my own lake
at least i can keep these fish in my own warm pond, and go back to them whenever i like
after decades of ice fishing, i think the LLM fish are quite good... and even when they are no good, that fishing experience makes it so easy to go back to the LLM and get the exact right fish for my pond
it will continue to be helpful for everyone in the future if we keep publishing the contents of our ponds, whether it's a web site like this, or in a repository
Learning a tool is useful, even invaluable, but if you don't have the persistence to use it, it's useless.
And many tools are just partially useful under some conditions. So creativity in using them is also useful.
So it's not about the tools, it's about not giving up and trying different things, which makes all tools more effective, and problem-solving more likely.
It's not the only approach, and definitely as someone once said, premature optimization is the root of all evil, and maybe you are optimizing at exactly the right time.
Who can tell?
And speaking of ffmpeg, or tooling in general, I tend to make notes. After a while you end up with a pretty decent curated reference.
In one tool you’ll use + to match one or more times, and \+ to mean literal plus sign.
In another tool you’ll use \+ to match one or more times, and + to mean literal plus sign.
In one tool you’ll use ( and ) to create a match group, and \( and \) to mean literal open and close parentheses.
In another tool you’ll use \( and \) to create a match group, and ( and ) to mean literal open and close parentheses.
This is basically the only problem I have when writing regexes, for the kinds of regexes I write.
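As a concrete illustration of the first convention, Python's `re` module treats a bare `+` and bare `( )` as operators and the backslashed forms as literals. The opposite convention is, as far as I know, what POSIX basic regular expressions (the default in tools like grep and sed) use, but that mapping is an assumption to check against your tool's manual rather than something this snippet demonstrates.

```python
import re

# In Python's `re`, a bare + means "one or more" and \+ is a literal plus.
assert re.fullmatch(r"a+", "aaa")
assert re.fullmatch(r"a\+", "a+")

# Likewise, bare ( ) create a group and \( \) match literal parentheses.
assert re.fullmatch(r"(ab)+", "abab").group(1) == "ab"
assert re.fullmatch(r"\(ab\)", "(ab)")

# Spelling out [0-9] instead of a shorthand keeps the pattern portable
# across dialects that support bracket expressions.
print(re.findall(r"[0-9]+", "take 2 of 10"))
```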
Also, one thing that’s not a problem per se but something that leads me to write my regexes with more characters than strictly necessary is that I rarely use shorthand for groups of characters. For example the tool might have a shorthand for digit but I always write [0-9] when I need to match a digit. Also probably because the shorthand might or might not be different for different tools.
Regexes are also known to be “write once read never”, in that writing a regex is relatively easy, but revisiting a semi-complicated regex you or someone else wrote in the past takes a little bit of extra effort to figure out what it’s matching and what edits one should make to it. In this case, tools like https://regex101.com/ or https://www.debuggex.com/ help a lot.
They're both tools where if they're part of your daily workflow you'll get immense value out of learning them thoroughly. If instead you need a regex once or twice a week, the benefit is not greater than the cost of learning to do it myself. I have a hundred other equally complicated things to learn and remember, half the job of the computer is to know things I can't put in my brain. If it can do the regex for me, I suddenly get 70% of the value at no cost.
Regex is not a tool I need often enough to justify the hours and brain space. But it is still an indispensable tool. So when I need a regex, I either ask a human wizard I know, or now I ask my computer directly.
My kid uses wifi, google classroom tools, youtube, games,... I can tell him if only you knew command.com, ipconfig, doom.wad formats, latex,... you could be so much more proficient. I already know this will never happen, just like I never learned x86 assembly.
The same goes for tools like LLMs, once you are used to them, your knowledge shifts.
"The Insanity Of Linux's Regular Expressions " https://www.youtube.com/watch?v=ys7yUyyQA-Y
All in all, my life would be miserable if I did not have regexp available in grep/sed/editor/ide/java/python; its usefulness trumps any such inconveniences.
does anyone know of any?
What are your current issues or what limitations have you run into?
If you have ffmpeg installed and an OpenAI env api key set, it should work out of the box.
Demo: https://image.non.io/1c7a92ef-0917-49ef-9460-6298c7a9116c.we...
i started with avisynth, and it took time for my brain to switch to ffmpeg. i don't know how i could function without ffmpeg at this point
The sense of elation I get when I wonder aloud to my digital friend and they generate what I thought was too much to expect. Well worth the subscription.
ffmpeg <Input file(s)> <Codec(s)> <MAPping of streams> <Video Filters> output_file
- input file: -i, can be repeated for multiple input files, like so: ffmpeg -i file1.mp4 -i file2.mkv
If there is more than one input file then some mapping is needed to decide what goes out in the output file.
- codec: -c:x where x is the type of codec (v: video, a: audio or s: subtitles), followed by its name, like so:
-c:v libx265
I usually never set the audio codec, as the guesses made by ffmpeg based on the output file type are always right (in my experience). Deciding the video codec is useful, though, and so is the subtitle codec, as not all containers (file formats) support all codecs; mkv is the most flexible for subtitle codecs.
- mapping of streams: -map <input_file>:<stream_type>:<order>, like so:
-map 0:v:0 -map 1:a:1 -map 1:a:0 -map 1:s:4
Map tells ffmpeg what stream from the input files to put in the output file. The first number is the position of the input file in the command, so if we're following the same example as above, '0' would be 'file1.mp4' and '1' would be 'file2.mkv'. The parameter in the middle is the stream type (v for video, a for audio, s for subtitles). The last number is the position of the stream IN THE INPUT FILE (NOT in the output file).
The position of the stream in the output file is determined by the position of the map command in the command line, so for example in the command above we are inverting the position of the audio streams (taken from 'file2.mkv'), as audio stream 1 will be in first position in the output file, and audio stream 0 (the first in the second input file) will be in second position in the output file.
This map thing is for me the most counter-intuitive because it's unusual for a CLI to be order-dependent. But, well, it is.
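To make the two orderings concrete, here is a small sketch that assembles such a command from a list of inputs and a list of maps. The helper name `build_map_command` and the output name `output.mkv` are made up for the illustration; only the `-i`, `-c:v` and `-map` options come from the description above.

```python
def build_map_command(inputs, maps, output, video_codec=None):
    """Assemble an ffmpeg argument list.

    inputs: list of input file paths (their order defines the input index).
    maps:   list of (input_index, stream_type, stream_index) tuples; their
            order defines the stream order in the output file.
    """
    args = ["ffmpeg"]
    for path in inputs:
        args += ["-i", path]
    if video_codec is not None:
        args += ["-c:v", video_codec]
    for input_index, stream_type, stream_index in maps:
        args += ["-map", f"{input_index}:{stream_type}:{stream_index}"]
    args.append(output)
    return args


# Same mapping as the example above: video from file1.mp4, then the two
# audio streams of file2.mkv in swapped order, then one of its subtitle streams.
cmd = build_map_command(
    ["file1.mp4", "file2.mkv"],
    [(0, "v", 0), (1, "a", 1), (1, "a", 0), (1, "s", 4)],
    "output.mkv",
    video_codec="libx265",
)
print(" ".join(cmd))
```

Reordering the tuples in `maps` changes the stream order in the output, while the numbers inside each tuple always refer to positions in the input files.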
- video filters: -vf
Video filters can be extremely complex and I don't pretend to know how to use them by heart. But one simple video filter that I use often is 'scale', for resizing a video:
-vf scale=<width>:<height>
width and height can be exact values in pixels, or one of them can be '-1' and then ffmpeg computes it based on the current aspect ratio and the other provided value, like this for example: -vf scale=320:-1
This doesn't always work because the computed value should be an even integer; if it's not, ffmpeg will raise an error and tell you why; then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).
And that's about it! ffmpeg's options are immense, but this gets me through 90% of my video encoding needs, without looking at a manual or asking an LLM. (The only other options I use often are -ss and -t for start time and duration, to time-crop a video.)
ffmpeg -i <movie-with-many-tracks.mkv> -map 0:0 -map 0:5 -map 0:12 -vcodec copy -acodec copy -scodec copy "output-movie.mkv"
Use: sometimes I have a file with a lot of audio and/or subtitle streams but only want one or two of each – here, 0:0 is the video, 0:5 is English audio, and 0:12 was the subtitle track I wanted. Setting the codecs to “copy” means nothing gets reencoded.
It's not about being an integer; some of the sizes need to be even. You can use `-vf scale=320:-2` to ensure that.
But thanks for '-2', didn't know about that! It's the exact default option I needed! Will be using that always from now on.
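A quick way to see why -1 sometimes fails and -2 does not is to do the arithmetic by hand. The function below is only an approximation of the calculation (ffmpeg's exact rounding may differ), and `scaled_height` is a name invented for this sketch.

```python
def scaled_height(in_w, in_h, out_w, divisible_by=1):
    """Height that keeps the aspect ratio, rounded to a multiple of divisible_by.

    divisible_by=1 mimics scale=W:-1 and divisible_by=2 mimics scale=W:-2.
    The rounding here is an approximation of what ffmpeg does, not a copy of it.
    """
    exact = out_w * in_h / in_w
    return round(exact / divisible_by) * divisible_by


print(scaled_height(1920, 1080, 500))     # odd result, which some encoders reject
print(scaled_height(1920, 1080, 500, 2))  # nearest even height
```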
https://stackoverflow.com/questions/71092347/ffmeg-option-sc...
Likely because the aspect ratio will no longer be the same. There will either be lost information (cropping), compression/stretching, or black bars, none of which should be default behaviour. Hence, the warning.
I had a problem I'd been thinking about for some time and I thought "Ill have some LLM give me an answer" and it did - it was wrong and didn't work but it got me to thinking about the problem in a slightly different way and my quacks after that got me an exact solution to this problem.
So I'm willing to give the AI more than partial credit.
If the CLI is installed, you can do: gencmd -c ffmpeg extract first 1 minute of video
Or you can just search for the same in the browser page.
But really what ffmpeg is missing is an expressive language to describe its operation. Something well-structured, like what jq does for JSON.
Weird thing is I got better performance without "-c:v h264_videotoolbox" on the latest Mac update, maybe some performance regression in Sequoia? I don't know. The equivalent flag for my Windows machine with an Nvidia GPU is "-c:v h264_nvenc". I wonder why ffmpeg doesn't just auto-detect this? I get about an 8x performance boost from this. Probably the one time I actually earned my salary at work was when we were about to pay out the nose for more cloud servers with GPUs to process video, and I noticed the version of ffmpeg that came installed on the machines was compiled without GPU acceleration!
[0] https://gist.githubusercontent.com/nielsbom/c86c504fa5fd61ae...
[1] https://gist.githubusercontent.com/jazzyjackson/bf9282df0a40...
The issue with cloud CPUs is that they don't come with any of the built-in hardware video encoders that consumer-grade CPUs have, so you'll have to go with the GPU machines that cost so much more. To be honest I haven't tried using HW accel in the cloud to have a proper price comparison; are you saying you did it and it was worth it?
The file size is also problematic: I've had hardware encodes twice as large as the same video encoded with CPU.
Connect credit card, open a web UI, send the command, the files, and eventually get the output?
You just need to use the hetzner API's to put all your video on a shared drive, write a simple job runner in whatever language you like or even simpler you could write your commands in a text file on the shared drive. Write a simple script to mount the shared drive, look for the job file on machine startup; then have your machine delete itself via hetzner API. Email yourself before that. There, you have your weekend project.
One of the advantages of working with image data is that movies are really just 3d data and as long as all the movies you work with are the same size, if you have enough ram, or use dask, you could basically do this in a couple lines of numpy.
-c:v h264_nvenc
This is useful for batch encoding, when you're encoding a lot of different videos at once, because you can get better encoding throughput.
But in my limited experiments a while back, I found the output quality to be slightly worse than with libx264. I don't know if there's a way around it, but I'm not the only one who had that experience.
But for multiple streams or speed requirements, nvenc is the only way to fly.
Edit: relevant docs from ffmpeg; they back up your perception, and now I'm left to wonder how much I want to learn about profiles in order to cut up these videos. I suppose I'll run an overnight job to reencode them from AVI to h264 at high quality, and make sure the scene detect script is only doing copies, not reencoding, since that's the part I'm doing interactively. There's no real reason I should be sitting at the computer while it's transcoding.
Hardware encoders typically generate output of significantly lower quality than good software encoders like x264, but are generally faster and do not use much CPU resource. (That is, they require a higher bitrate to make output with the same perceptual quality, or they make output with a lower perceptual quality at the same bitrate.)
For manually cutting up videos, I use LosslessCut, which I think uses ffmpeg under the hood and is really helpful for finding and cutting on keyframes.
Find a complex short scene in your cpu encoded video, extract it, ffprobe it to get average video bitrate, and take the same clip in raw and try gpu accelerated encoding at +20% bitrate. From there, iterate.
For a friend’s use-case that I helped with, a +30% video bitrate bump overcame the degraded video quality.
Edit: strangely enough, if memory serves, after the correcting +30% was applied the actual ffprobe bitrates between the videos were very similar, maybe a 10% or less difference. Someone smarter than me can work that logic out.
Hardware encoding is often less configurable and involves greater trade-offs than using sophisticated software codecs, and doesn't produce exactly equivalent results even with equivalent parameters. On top of that, systems often have multiple hardware APIs to choose from that often offer different features.
FFMpeg is a complex command-line tool intended for users who are willing to learn its intricacies, so I'm not sure it makes sense for it to set defaults based on assumptions.
Lately, I've been playing around with more esoteric functionality. For example, storing raw video straight off a video camera on a fairly slow machine. I built a microscope and it reads frames off the camera at 120FPS in raw video format (YUYV 1280x720) which is voluminous if you save it directly to disk (gigs per minute). Disks are cheap but that seemed wasteful, so I was curious about various close-to-lossless techniques to store the exact images, but compressed quickly. I've noticed that RGB24 conversion in ffmpeg is extremely slow, so instead after playing around with the command line I ended up with:
ffmpeg -y -f rawvideo -pix_fmt yuyv422 -s 1280x720 -i test.raw -vcodec libx264 -pix_fmt yuv420p -crf 13 movie.mp4
This reads in raw video. Because raw video doesn't have a container, it lacks metadata like "pixel format" and "image size", so I have to provide those. It's order dependent: everything before "-i test.raw" is for decoding the input, and everything after is for writing the output. I do one tiny pixel format conversion (that ffmpeg can do really fast) and then write the data out in a very, very close to lossless format with a container (I've found .mkv to be the best container in most cases).
Because I hate command lines, I ended up using ffmpeg-python which composes the command line from this:
self.process = (
    ffmpeg
    .input(
        "pipe:",
        format="rawvideo",
        pix_fmt="yuyv422",
        s="{}x{}".format(1280, 720),
        threads=8
    )
    .output(
        fname, pix_fmt="yuv422p", vcodec="libx264", crf=13
    )
    .overwrite_output()
    .global_args("-threads", "8")
    .run_async(pipe_stdin=True)
)
and then I literally write() my frames into the stdin of that process. I had to limit the number of threads because the machine has 12 cores and uses at least 2 at all times to run the microscope.
I'm still looking for better/faster lossless YUV encoding.
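For a sense of the data volume mentioned above ("gigs per minute"), the arithmetic for raw YUYV at 1280x720 and 120 FPS works out as follows. The only assumption beyond the numbers in the comment is that YUYV (packed 4:2:2) takes 2 bytes per pixel.

```python
WIDTH, HEIGHT, FPS = 1280, 720, 120
BYTES_PER_PIXEL = 2  # YUYV (4:2:2 packed) stores 2 bytes per pixel on average

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = frame_bytes * FPS
bytes_per_minute = bytes_per_second * 60

print(f"{frame_bytes} bytes per frame")
print(f"{bytes_per_second / 1e6:.1f} MB/s")
print(f"{bytes_per_minute / 1e9:.2f} GB per minute")
```

The same `frame_bytes` figure is the size each chunk written to the encoder's stdin would need to have, since raw video has no framing of its own.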
Which is appropriate. A Unix pipeline is dependent on the order of the components, and complex FFMpeg invocations entail doing something analogous.
>I ended up using ffmpeg-python which composes the command line from this
A lot of people like this aesthetic, but doing "fluent" interfaces like this is often considered un-Pythonic. (My understanding is that ffmpeg-python is designed to mirror the command-line order closely.) The preference (reinforced by the design of the standard library and built-in types) is to have strong https://en.wikipedia.org/wiki/Command%E2%80%93query_separati... . By this principle, it would look something more like
ffmpeg(global_args=..., overwrite_output=True).process_async(piped_input(...), output(...))
where using a separate construction process for the input produces a different runtime type, which also cues the processing code that it needs to read from stdin.
As for what's unpythonic: don't care. My application has code horrors that even Senior Fellows cannot unsee.
Look no further: https://trac.ffmpeg.org/wiki/Encode/FFV1
> I wasn't able to produce results that were convincingly better than libx264
With "-qp 0"? Otherwise, it's not a valid comparison... "-crf 13" is nowhere near lossless (though it might appear so visually).
FFV1 is much better than H264 at lossless compression in my experience. Here's a random sample of a ten second 4K input I had handy (5.5G uncompressed):
h264-ultrafast 1.951s 850M
h264-veryslow 46.528s 715M
ffv1 8.883s 637M
But yeah, if you don't actually require truly lossless data, it's a huge waste.
If you are doing processing with intermediate steps you do not want to keep? Ramdisks. Oh yeah. Oh yeah.
When I set up the server, the ramdisk didn't have a way of shrinking when space wasn't needed so had to make sure it doesn't eat up all memory when growing unlimited. I bet it's smarter nowadays.
Ramdisks are so very handy now when you can have much more than 512 kB ram... :-)
e.g. When doing a simple copy, progress status messages upgrade to scientific notation.
https://www.ffmpegbyexample.com/examples/l1bilxyl/get_the_du...
Don’t call two extra tools to do string processing, that is insane. FFprobe is perfectly capable of giving you just the duration (or whatever) on its own:
ffprobe -loglevel quiet -output_format csv=p=0 -show_entries format=duration video.mp4
Don’t simply stop at the first thing that works; once it does, think to yourself whether there is a way to improve it.
I like your solution better!
Yes, I agree. It was decidedly the wrong word to use and the post would undoubtedly have been better without that part. Unfortunately, the edit window had already passed by the time I reread it.
I wonder how small of an LLM you could develop if you only wanted to target creating ffmpeg commands. Perhaps it could be small enough to be hosted on a static webpage where it is run locally?
Now I say this, it seems like there should already be a shell that is also an LLM where you can mix bits of commands you vaguely remember and natural language a bit like Del Boy speaking French...
https://www.warp.dev/blog/how-warp-works
Warp terminal – no more login required (49 days ago) https://news.ycombinator.com/item?id=42247583
Show HN Warp.dev (3 years ago) https://news.ycombinator.com/item?id=30921231
Show HN llmterm https://news.ycombinator.com/item?id=42498901 https://github.com/timschmidt/llmterm
Relevant XKCD https://xkcd.com/1168/
It’s one of the only tools where I reach for a GUI equivalent (Handbrake) by default, unless I’m doing batch processing. There are a few pure ffmpeg GUIs out there as well. There’s just something about working with video that CLI doesn’t work right with my brain for.
If you have a pretty normal use case the Bins (decodebin, transcodebin, playbin) make that pretty easy. If you have a more complex use case the flexibility of the design makes it possible.
It would be cool to see if a TUI tool existed. Something like https://github.com/Twinklebear/fbed but more feature complete.
For reference:
One-liner:
> ffmpeg -loglevel info -f concat -safe 0 -i <(for f in *.mkv; do echo "file '$(pwd)/$f'"; done) -c copy output.mkv
Or the method I ended up using, create a files.txt file with each file listed[0]
> ffmpeg -f concat -safe 0 -i files.txt -c copy output.mkv
files.txt
> file 'file 1.mkv'
> file 'file 2.mkv'
> # list any additional files
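If the list file is generated rather than typed by hand, file names containing spaces or quotes need care. A rough sketch, assuming the concat list format's single-quote escaping works like shell quoting (the helper name `concat_list` is invented here):

```python
def concat_list(paths):
    """Return the text of a concat-demuxer list file for the given paths."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal ' is written as '\'' (close quote,
        # escaped quote, reopen quote).
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


text = concat_list(["file 1.mkv", "file 2.mkv"])
print(text, end="")
# The list would then be saved as files.txt and passed to:
#   ffmpeg -f concat -safe 0 -i files.txt -c copy output.mkv
```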
$ helpme ffmpeg capture video from /dev/video0 every 1 second and write to .jpg files like img00000.jpg, img00001.jpg, ...
$ helpme ffmpeg assemble all the .jpg files into an .mp4 timelapse video at 8fps
$ helpme ffmpeg recompress myvideo.mp4 for HTML5-friendly use and save the result as myvideo_out.webm
I know there are full blown AI terminals like Warp but I didn't like the idea of a terminal app requiring a login, possibly sending all my commands to a server, etc. and just wanted a script that only calls the cloud AI when I ask it to.
It's not a great name and not very discoverable, but there's a lot of very useful ffmpeg-by-example snippets there with illustrated results, and an explanation of what each option in each example does.
So changed the verbosity to trace:
ffmpeg -v trace -f data -i input.txt -map 0:0 -c text -f data -
---snip--
[dost#0:0 @ 0x625775f0ba80] Encoder 'text' specified, but only '-codec copy' supported for data streams
[dost#0:0 @ 0x625775f0ba80] Error selecting an encoder
Error opening output file -.
Error opening output files: Function not implemented
[AVIOContext @ 0x625775f09cc0] Statistics: 10 bytes read, 0 seeks
I was expecting text to be written to stdout? What did I miss?
However, from the same Reddit thread, this works:
ffmpeg -v quiet -f data -i input.txt -map 0 -f data pipe:1
EDIT: just verified the `-c text` approach works on FFmpeg major versions 4 and 5. From FFmpeg 6 onwards, it's broken. The `pipe:1` method works from FFmpeg 5 onwards, so the site should probably be updated to use that instead (also, FFmpeg 5.1 is an LTS release).
especially: point 4 is the final giveaway!
The closest I seem to be able to get is to divide the file size by the file length, add some wiggle room and then split it based on time. Any pointers appreciated.
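The size-over-duration estimate described above amounts to the following arithmetic. It assumes a roughly constant bitrate, and the function name, the 10% wiggle room and the example numbers are all made up for illustration.

```python
def segment_seconds(file_bytes, duration_s, max_segment_bytes, wiggle=0.10):
    """Estimate how long each time-based segment can be to stay under a size cap.

    Assumes a roughly constant bitrate; wiggle shrinks the segment to leave
    headroom for variable-bitrate peaks.
    """
    bytes_per_second = file_bytes / duration_s
    return max_segment_bytes * (1 - wiggle) / bytes_per_second


# Hypothetical numbers: a 700 MB, 90 minute file split into pieces under 100 MB.
seconds = segment_seconds(700e6, 90 * 60, 100e6)
print(f"split roughly every {seconds:.0f} seconds")
```

The resulting duration could then be used as the time value for a time-based split; with stream copy the cuts land on keyframes, so actual piece sizes will still vary somewhat.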
Can we have a best of HNN and put it on there, or vote on it, or whatever?
Which indeed you have: https://news.ycombinator.com/favorites?id=Over2Chars
There are a number of HN lists: https://news.ycombinator.com/lists
"Best": https://news.ycombinator.com/best is highest voted recent links.
I finally found that six point font "favorite" that appears only once and at the top of the comment section for a topic.
You guys don't make it easy.
there was one time I didn't use pyaudio correctly so I was using this process where ffmpeg can stitch multiple audio files together into one passed in as an array cli argument, crazy
Anyway, long story short: instead of the usual terrifying inline ffmpeg filter tangle, the filter can be structured however you want and you can include it from a dedicated file. It sounds petty, but I really think it was the thing that finally let me "crack" ffmpeg.
The secret sauce is the "/", "-/filter_complex file_name" will include the file as the filter.
As I am pretty happy with it I am going to inflict it on everyone here.
In motion_detect.filter
[0:v]
split
[motion]
[original];
[motion]
scale=
w=iw/4:
h=-1,
format=
gbrp,
tmix=
frames=2
[camera];
[1:v]
[camera]
blend=
all_mode=darken,
tblend=
all_mode=difference,
boxblur=
lr=20,
maskfun=
low=3:
high=3,
negate,
blackframe=
amount=1,
nullsink;
[original]
null
And then some python glue logic around the command:
ffmpeg -nostats -an -i ip_camera -i zone_mask.png -/filter_complex motion_detect.filter -f mpegts udp://127.0.0.1:8888
And there you have it: motion detection while staying in a single ffmpeg process. The glue logic watches stdout for the blackframe messages and saves the video.
explanation:
"[]" are named inputs and outputs
"," are pipes
";" ends a pipeline
Take input 0 and split it into two streams, "motion" and "original". The motion stream gets scaled down, converted to gbrp (later blends were not working on yuv data), then temporally mixed with the previous two frames (to remove high-frequency motion), and sent to stream "camera". Take the zone mask image provided as input 1 and the "camera" stream, mask the camera stream, find the difference with the previous frame to bring out motion, blur to expand the motion pixels and then mask to black/white, and invert the image for correct blackframe analysis, which will print messages on stdout when too many motion pixels are present. The "original" stream gets sent to the output for capture.
One odd thing is the mpegts, I tried a few more modern formats but none "stream" as well as mpegts. I will have to investigate further.
I could, and probably should have, used opencv to do the same. But I wanted to see if ffmpeg could do it.
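The glue logic itself is not shown above, so the following is only a guess at its general shape: scan ffmpeg's log lines for blackframe reports and pull out the timestamp. The log line format in the sample is an assumption (check it against your own ffmpeg build), and all names here are hypothetical.

```python
import re

# Assumed shape of a blackframe log line; check it against your ffmpeg build.
BLACKFRAME = re.compile(r"frame:(\d+)\s+pblack:(\d+)\s+pts:(-?\d+)\s+t:([0-9.]+)")


def parse_blackframe(line):
    """Return (frame, pblack, seconds) for a blackframe log line, else None."""
    match = BLACKFRAME.search(line)
    if match is None:
        return None
    frame, pblack, _pts, seconds = match.groups()
    return int(frame), int(pblack), float(seconds)


def motion_events(lines):
    """Yield the timestamp of every line that reports a detection."""
    for line in lines:
        parsed = parse_blackframe(line)
        if parsed is not None:
            yield parsed[2]


sample = [
    "[Parsed_blackframe_9 @ 0x5581] frame:120 pblack:99 pts:432000 t:4.800000 type:P last_keyframe:96",
    "frame=  130 fps= 25 q=-1.0 size=     512kB",
]
print(list(motion_events(sample)))
```

In a real script the lines would come from a pipe attached to the running ffmpeg process instead of a list; ffmpeg normally writes its log to stderr, so that may be the pipe to read.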
Currently looking for an FFmpeg related job https://gariany.com/about
While as a concept, I absolutely love "X by Example" websites, this one seems to make some strange decisions. First, the top highlighted example is just an overly complicated `cat`. I understand that it's meant to show the versatility of the tool, but it's basically useless.
Then below, there's 3 pages of commands, 10 per page. No ordering whatsoever in terms of usefulness. There looks like there's an upvote but it's actually just a bullet decoration.
There's also a big "try online" button for a feature that's not actually implemented.
All in all, this is a pretty disappointing website that I don't think anyone in this thread will actually use, even though everyone seems to be "praising" it.
The build system randomizes an example to showcase on the homepage; I actually find it funny that it's a different example every time.
Regarding the upvote system. This is a static documentation website. I have created a crazy unique solution to have upvotes working but the website had literally zero traffic in years, so I guess that wasn't the most important feature to focus on.
Sorry to disappoint, I'm doing it completely voluntarily - happy to get any help here: https://github.com/eladg/ffmpeg-by-example
You push the input files, the command, and fetch the output when done.
Maybe if it was really cheap and the servers could process the job in a small fraction of the time it takes locally. Otherwise, I'd just run it locally.
If you built a ui that made it easier to use as part of the offering, it'd make sense for a lot of people.
Why do you think a UI would make sense?
A UI like what? Something to drive the user toward what they want to do?
People don't have the goal "run ffmpeg", they have a goal like "transcode and host videos". See eg. mux.com on here recently.
> cost-plus
This is an extremely non-startup pricing model.
I am looking explicitly for people who want to "run ffmpeg"
I don't have the skills, nor the capital, to build a solution for the whole market of people that want to mess with videos.
Just for reference, Mux raised 177M since 2016.
The goal is not to run a startup, but a small business with recurrent business.
Of course there are a lot of ways to approach this.
One would be an API sold at cost plus (which is the closest to my skillset.)
Building on top of that would be trivial for more pay-for-value product.
Right now, I am looking to normalize some audio without using ffmpeg-normalize, a popular Python package. Nothing against it on a personal level, I just ... want to know what is going on, and it's a lot of files and lines of code to do what is basically a two-pass process.
I have a growing interest in metadata and that's also a case which I do not find is often well-addressed.
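For the two-pass normalization mentioned above, one common approach uses ffmpeg's loudnorm filter: a measurement pass that prints statistics as JSON, then a correction pass that feeds those numbers back in. The sketch below only builds the commands and parses the measurements; it is not a description of how ffmpeg-normalize works internally, the target values are arbitrary examples, and the sample log is fabricated for the demonstration.

```python
import json


def first_pass_cmd(src, target_i=-16.0, target_tp=-1.5, target_lra=11.0):
    """Measurement pass: analyse the file and print loudness stats as JSON."""
    af = f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}:print_format=json"
    return ["ffmpeg", "-hide_banner", "-i", src, "-af", af, "-f", "null", "-"]


def parse_measurements(log_text):
    """Extract the JSON object that loudnorm prints at the end of its log."""
    start = log_text.rindex("{")
    end = log_text.rindex("}") + 1
    return json.loads(log_text[start:end])


def second_pass_cmd(src, dst, m, target_i=-16.0, target_tp=-1.5, target_lra=11.0):
    """Correction pass: feed the measured values back into loudnorm."""
    af = (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={m['input_i']}:measured_TP={m['input_tp']}"
        f":measured_LRA={m['input_lra']}:measured_thresh={m['input_thresh']}"
        f":offset={m['target_offset']}:linear=true"
    )
    return ["ffmpeg", "-hide_banner", "-i", src, "-af", af, dst]


# Fabricated example of the tail of a first-pass log, for illustration only.
sample_log = """[Parsed_loudnorm_0 @ 0x1]
{
    "input_i" : "-23.10",
    "input_tp" : "-4.20",
    "input_lra" : "6.30",
    "input_thresh" : "-33.50",
    "target_offset" : "0.40"
}"""

measured = parse_measurements(sample_log)
print(" ".join(second_pass_cmd("in.wav", "out.wav", measured)))
```

Running the first command with `subprocess.run(..., capture_output=True, text=True)` and passing its captured log text to `parse_measurements` would connect the two passes.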