TUNDRA // NEXUS


I finally found a local LLM I actually want to use for coding

🔗 xda-developers.com
February 19, 2026
SIGNAL 7/10
#ai #dev

🟢 READ | ⏱ 8 min | 📑 7/10 | 🎯 Developers, security researchers, LLM enthusiasts

TL;DR

After years of running local LLMs largely on principle, Adam Conway found Qwen3-Coder-Next genuinely practical for daily coding and security research. The secret: capable hardware (128GB of unified memory), a smart architecture (sparse MoE), and seamless Claude Code integration via a local vLLM endpoint.

Signal

  • Hardware: Lenovo ThinkStation PGX ($3K) with NVIDIA GB10 Grace Blackwell Superchip; 128GB of unified LPDDR5x memory eliminates the PCIe transfer bottleneck that plagues discrete GPUs.
  • Model: Qwen3-Coder-Next (80B params, 3B active via MoE), runs at ~46GB (Q4_K_M) or ~85GB (Q8_0); supports 256K native context, achieves 170K usable via Gated DeltaNet hybrid attention (75% linear + 25% full).
  • Setup: Docker + vLLM on DGX OS; Claude Code points to local endpoint via env vars (no proxy needed); zero API latency, rate limits, or usage costs.
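The setup described above can be sketched roughly as follows. The container image, port, and model repo name are illustrative assumptions (the summary doesn't give the exact Hugging Face ID), and the ANTHROPIC_* variables are Claude Code's documented mechanism for redirecting requests to a custom endpoint:

```shell
# Serve the model with vLLM in Docker (model ID below is a placeholder).
# Recent vLLM releases expose an Anthropic-compatible /v1/messages route,
# which is what lets Claude Code talk to it without a translation proxy.
docker run --gpus all -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen3-Coder-Next   # hypothetical repo name

# Point Claude Code at the local endpoint instead of Anthropic's API.
export ANTHROPIC_BASE_URL="http://localhost:8000"
export ANTHROPIC_AUTH_TOKEN="dummy"            # vLLM ignores it by default
export ANTHROPIC_MODEL="Qwen/Qwen3-Coder-Next" # must match the served model
claude
```

With this in place, requests never leave the machine, so there is no API latency, rate limiting, or per-token cost.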

What They're NOT Telling You

The $3,000 hardware barrier puts this out of reach for most hobbyists, though the author notes Qwen3-Coder-Next can run on 16-24GB of VRAM via MoE offloading. Privacy benefits are framed as a nice-to-have here, but for anyone handling NDAs, proprietary binaries, or sensitive reverse-engineering work, local execution isn't optional; it's compliance.
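For the lower-VRAM path, one common approach (using llama.cpp rather than the author's vLLM setup) is to keep the attention layers on the GPU while forcing the MoE expert tensors into CPU RAM; the GGUF file name below is illustrative:

```shell
# Keep all layers on GPU (-ngl 99) but override MoE expert weights to CPU.
# Since only ~3B of the 80B parameters are active per token, the GPU still
# carries most of the compute while VRAM use shrinks to the dense portion.
llama-server -m qwen3-coder-next-Q4_K_M.gguf \
  -ngl 99 \
  --override-tensor "\.ffn_.*_exps\.=CPU" \
  -c 65536
```

The trade-off is memory bandwidth: expert weights stream from system RAM each token, so throughput drops compared to an all-GPU (or unified-memory) configuration.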

Trust Check

Factuality ✅ | Author Authority ✅ | Actionability ✅