Late-Night Thread Exhaustion
It's 1 AM and somewhere inside the glowing rectangles of my homelab, Frigate is having what can only be described as an existential crisis. The feeds have all gone black. The logs are screaming. And I'm about to go down a rabbit hole that would make anyone question their recreational choices. I'm reminded of a meaningful quote.
"Nothing good happens after midnight." - Mom
The Setup (AKA: What Not To Do)
I've got this lovely RTX 4060 with a mere 8GB of VRAM (which, in GPU terms, is basically a studio apartment in Silicon Valley: technically livable but aggressively cozy). On one side, I've got Frigate watching 10 cameras. On the other side, I've got Ollama running a 7B model consuming 4-6GB of VRAM.
They are not getting along.
But here's the thing: the VRAM contention wasn't even the main problem. (Plot twist!)
The First Symptom: Black Screens and "No Stream Detected"
Around 00:41, my Frigate dashboard started looking like a dead broadcast—all those beautiful live camera feeds replaced with fuzzy static. The error logs were unhelpful: "go2rtc streams failed to initialize."
I did what any reasonable person does at 1 AM: I blamed Ollama. And, to be fair, I was right. But only partially right.
The GPU had 4-6GB locked up by Ollama's 7B model, while Frigate's detection pipeline was trying to squeeze detection models into the remaining 2GB. When both services tried to process video simultaneously, the streams collapsed. The VRAM contention was real enough to kill the streams, but this wasn't the killer blow.
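A quick aside on how to actually see that contention: the per-process query fields below can vary by driver version (check nvidia-smi --help-query-compute-apps), so treat this as a sketch rather than gospel.
# Whole-card view refreshed every 2 seconds; the process table at the bottom
# shows each PID and how much VRAM it holds (Ollama's runner vs Frigate's ffmpeg/detector)
watch -n 2 nvidia-smi
# Scriptable per-process view; the field names are an assumption, verify with --help-query-compute-apps
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv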
But wait—there's more. (There's always more.)
The Second Symptom: Recording Cache Warnings and Cascading Failure
Around 00:42, the logs started showing something more sinister:
WARNING: Too many unprocessed recording segments in cache for cam8-CX410
Then it got worse.
The system started screaming about thread exhaustion:
RuntimeError: can't start new thread
pthread_create failed: Resource temporarily unavailable
All 10 cameras crashed simultaneously at 00:51:47. Within seconds, the entire system tried to restart all of them at once. The result? A segmentation fault at 00:42:38. Frigate just gave up.
The logs looked like this:
ERROR: Could not open codec h264, error -11 (EAGAIN - the OS is refusing resources)
ERROR: Failed to initialize VideoCapture - everywhere - on every camera
The Realization: This Isn't About VRAM
I ran nvidia-smi. The GPU showed healthy. I ran it again. Still healthy.
But the logs? They were full of pthread_create failed: Resource temporarily unavailable.
That error code -11 isn't a GPU error. That's the OS refusing to allocate new threads.
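If you want to convince yourself of that mapping, the kernel headers spell it out (the path below is where they live on my Ubuntu machines; it may differ elsewhere):
# errno 11 is EAGAIN, which FFmpeg surfaces as "error -11"
grep -n 'EAGAIN' /usr/include/asm-generic/errno-base.h
# #define EAGAIN      11  /* Try again */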
That's the moment my brain actually engaged.
The Investigation: Down the Linux Rabbit Hole
I checked the thread limits:
cat /proc/sys/kernel/threads-max
6,300 threads.
On a system with 58GB of RAM.
Let that sink in. Fifty-eight gigabytes. And yet the kernel was capped at 6,300 threads. That's a limit you'd expect on a system from 2005.
The calculation for this limit is:
threads-max = RAM_pages / (8 × THREAD_SIZE / PAGE_SIZE)
With 58GB, I should've had ~500,000 threads. With 6,300? That suggests the system saw 4GB at boot time.
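Here's the same formula as a back-of-the-envelope shell check. It assumes a stock x86_64 kernel where PAGE_SIZE is 4KB and THREAD_SIZE is 16KB, so the divisor works out to 32:
# Expected threads-max given the RAM the kernel can currently see
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
pages=$(( mem_kb / 4 ))              # 4 KB pages
echo "expected: $(( pages / 32 ))"   # RAM_pages / (8 * THREAD_SIZE / PAGE_SIZE)
cat /proc/sys/kernel/threads-max     # what the kernel actually decided at boot
On this box the expected value prints around 475,000, which is where the ~500,000 figure comes from.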
Plot Twist: The VM RAM Boost
I originally provisioned my VM with 4GB. Inner dialog from some time in the past: "Hey, I only need 4GB of RAM to install the OS, I'll hot-add some more later." At boot, the kernel calculated threads-max based on what it saw: 4GB. So: 6,300 threads.
Then someone upgraded the allocation to 64GB. But the kernel didn't recalculate threads-max. I still don't know why it wasn't overwritten on any of the half a dozen times I've rebooted this honker of a VM since bumping up the RAM.
But there's another layer to this.
The Second Culprit: systemd's DefaultTasksMax
I dug into /etc/systemd/system.conf and found this:
#DefaultTasksMax=15%
DefaultTasksMax=945
945 threads per service. That's 15% of 6,300.
So even though the system technically had more capacity, every single service running under systemd was capped at 945 threads. Frigate, managing 10 camera streams, needs roughly 50-100 threads per camera for FFmpeg, detection, recording, and audio processing.
Frigate needed: ~1,000-2,000 threads
Frigate got: 945 threads
Result: exactly what we saw. System failure.
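If you want to see where a budget like that actually goes, a per-process thread census makes it obvious (NLWP is the number of threads each process owns):
# Top thread consumers on the host
ps -eo nlwp,pid,comm --sort=-nlwp | head -15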
The Cascade: Why Everything Broke at Once
Here's what unfolded:
- Ollama loads model → 6-7GB VRAM consumed
- Frigate tries to process video → Fights for the remaining VRAM while its thread count climbs
- Go2rtc streams fail → Frigate tries to restart them
- Mass restart of all 10 cameras simultaneously → THREAD SPIKE
- OS runs out of threads → No more pthread_create calls accepted
- Every camera fails to restart → Each failure generates more restart attempts
- Positive feedback loop → Cascading failure
- Segmentation fault → Frigate dies.
The Fix: A Series of Ugly systemd Edits
I needed to fix three things:
Thing 1: Disable audio detection
Audio detection was running on all 9 cameras and eating threads. In the Frigate config:
audio:
  enabled: false
That alone saved ~100 threads.
Thing 2: Fix the DefaultTasksMax
sudo nano /etc/systemd/system.conf
Changed:
#DefaultTasksMax=15%
To:
DefaultTasksMax=65535
Then reloaded:
sudo systemctl daemon-reexec
sudo systemctl daemon-reload
Thing 3: Force threads-max to Match Reality
echo "kernel.threads-max=500000" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
And then:
sudo reboot
The Resurrection Test
After reboot:
cat /proc/sys/kernel/threads-max
# 500000 ✓
systemctl show --property DefaultTasksMax
# 65535 ✓
Started Frigate and watched the thread count:
watch 'docker exec frigate ps -eLf | wc -l'
Stabilized at around 1,200 threads, comfortably past the old 945-per-service cap that used to kill it.
All 10 cameras came up. All of them. Even cam8, which had been having its own special problems.
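To keep an eye on headroom going forward, this is the quick check I've been running since (docker.service is an assumption about how Frigate is launched on this box):
# Kernel-wide headroom: threads in use vs the kernel ceiling
echo "threads: $(ps -eLf --no-headers | wc -l) / $(cat /proc/sys/kernel/threads-max)"
# Per-unit headroom: systemd reports current tasks against the unit's limit
systemctl status docker --no-pager | grep -i 'tasks'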
The Blame Game: Who's Actually At Fault?
- Ollama? Partially guilty. The 7B model hogging 6-7GB VRAM was throwing fuel on the fire.
- Frigate? Innocent. It was doing its job until it ran out of threads.
- The Admin? The real villain. Created a 4GB VM without documentation, then upgraded it without updating the kernel parameters. ... Yeah, that's me. ;-(
- systemd's 15% default? Also guilty. That default works fine for desktop systems. It's terrible for container workloads.
- My own incompetence at 1 AM? Definitely in the mix... a lot.
The Lesson: A Checklist For The Forgetful
Next time I spin up a new Ubuntu VM, I'm running these checks after I set the final RAM amount, and again every time I change it. And I guess RAM hotplug doesn't work as well as I would have hoped.
# 1. What does the system think it has?
free -h
cat /proc/meminfo | grep MemTotal
# 2. What's the kernel thread limit?
cat /proc/sys/kernel/threads-max
# Should be roughly: RAM_GB * 8,000 to 10,000
# 3. What's systemd limiting services to?
systemctl show --property DefaultTasksMax
# Should be at least 30,000+ for container workloads
# 4. Are there GRUB overrides messing with me?
cat /proc/cmdline
If any of those numbers look suspect, fix them immediately. Don't wait. Don't tell yourself you'll do it later. Future You will regret that decision at 2 AM.
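To make that checklist harder to forget, I've folded it into a tiny script. Treat it as a sketch; the "roughly 8,000 threads per GB" heuristic is my own rule of thumb, not a kernel guarantee, and the warning threshold below is deliberately loose:
#!/usr/bin/env bash
# post-resize-check.sh: sanity-check kernel/systemd limits against current RAM
set -euo pipefail

mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
threads_max=$(cat /proc/sys/kernel/threads-max)
tasks_max=$(systemctl show --property DefaultTasksMax --value)

echo "RAM: ${mem_gb} GB  threads-max: ${threads_max}  DefaultTasksMax: ${tasks_max}"

# Flag a threads-max that looks like it was computed for far less RAM than we have now
if (( threads_max < mem_gb * 4000 )); then
  echo "WARNING: kernel.threads-max looks undersized for ${mem_gb} GB of RAM"
fi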
The VRAM Situation: Still Needs Fixing
To be clear, the GPU memory contention was also a real problem. With the thread exhaustion fixed, Frigate is now stable. But to resurrect Ollama without causing the same black-screen go2rtc collapse:
OLLAMA_KEEP_ALIVE=5m # Unload model after idle time
OLLAMA_NUM_GPU_LAYERS=25 # Hybrid CPU/GPU inference
This lets both services coexist on the same GPU. Ollama doesn't hog all the VRAM. Frigate gets breathing room. It's still an 8GB GPU trying to do 16GB of work, but we're working with what we've got.
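Where those variables actually live depends on how Ollama is installed. On my box it's the stock systemd service, so a drop-in override is the natural place; ollama.service is an assumption about the install method, and the variables are just the ones listed above:
# Persist the environment for the Ollama service (assumes the standard ollama.service unit)
sudo systemctl edit ollama.service
# In the drop-in that opens:
#   [Service]
#   Environment="OLLAMA_KEEP_ALIVE=5m"
#   Environment="OLLAMA_NUM_GPU_LAYERS=25"
sudo systemctl restart ollama
The keep-alive is the part doing the heavy lifting: the model gets evicted from VRAM after five idle minutes instead of squatting there all night.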
The Real Problem: Configuration Debt
The issue wasn't any single component. It was three layers of configuration problems stacked on top of each other:
- A VM provisioned with assumptions that no longer hold
- Default systemd limits designed for a different use case
- Two memory-intensive applications competing for the same GPU
Any one alone is manageable. All three at once caused cascading failure.
This is why, going forward, I'll obsessively check thread limits on every system. Because at 2 AM, staring at logs full of pthread_create failed, wondering if restarting everything will actually help... that's when you realize infrastructure isn't glamorous. It's just a series of configuration files, carefully balanced, waiting for someone to forget a crucial detail.
tl;dr:
- Frigate + Ollama on 8GB GPU = VRAM contention ✓ (Real, fixable with OLLAMA_KEEP_ALIVE and GPU layer limits)
- threads-max stuck at 6,300 on 58GB RAM = system fragility ✓ (The actual killer, fixed with sysctl)
- DefaultTasksMax at 15% = 945 threads per service ✓ (Fixed with systemd config)
- All three together = 2 AM segmentation fault ✓ (The adventure I didn't want)
And that's why I'm writing this at 3 AM instead of sleeping like a functional human being. Infrastructure troubleshooting is a journey, and sometimes that journey takes you through Linux kernel parameters and back again, but we come out wiser and more experienced for it.
(Now, if you'll excuse me, I have some cameras to watch.)