Show HN: VimLM – A Local, Offline Coding Assistant for Vim

github.com

92 points by JosefAlbers 7 days ago

VimLM is a local, offline coding assistant for Vim. It’s like Copilot but runs entirely on your machine—no APIs, no tracking, no cloud.

- Deep Context: Understands your codebase (current file, selections, references).
- Conversational: Iterate with follow-ups like "Add error handling".
- Vim-Native: Keybindings like `Ctrl-l` for prompts, `Ctrl-p` to replace code.
- Inline Commands: `!include` files, `!deploy` code, `!continue` long responses.

Perfect for privacy-conscious devs or air-gapped environments.

Try it:

```
pip install vimlm
vimlm
```

[GitHub](https://github.com/JosefAlbers/VimLM)

toprerules 7 days ago

Awesome. AI isn't making Vim less relevant; it's more relevant than ever. When every editor can have maximum magic with the same model and LSP, why not use the tool that also lets you review AI-generated diffs and navigate at lightning speed? Vim is a tool that can actually keep up with how fast AI accelerates the dev cycle.

Also love to see these local solutions. Coding shouldn't just be for the rich who can afford to pay for cloud solutions. We need open, local models and plugins.

  • JosefAlbers 7 days ago

    Thanks! I totally agree. I’m looking at ways to further tighten the pairing between Vim’s native tools and LLMs (like with :diff and :make/:copen to run the code, feed errors back to the LLM, then apply the fixes, etc). The catch is model variability—what works for Llama doesn’t always work with R1 because of formatting/behavior quirks, and vice versa. Finding a common ground for all models is proving tricky.
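
    A rough sketch of the quickfix half of that loop, using Vim's built-in :python3 interface rather than VimLM's actual code (the helper name is made up):

    ```python
    # Sketch: run :make, collect quickfix entries, and format them as a prompt
    # that can be fed back to the LLM. Not VimLM's real API.
    import vim

    def quickfix_errors_to_prompt():
        vim.command("silent make!")        # run the build and fill the quickfix list
        items = vim.eval("getqflist()")    # list of dicts with 'lnum', 'text', ...
        if not items:
            return None
        lines = ["Fix these build errors:"]
        for item in items:
            lines.append(f"line {item.get('lnum')}: {item.get('text')}")
        return "\n".join(lines)            # hand this string to the model as context
    ```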

elliotec 7 days ago

Why does it need an Apple M-series chip? Any hope of getting it running on an Intel chip and using it with Linux?

  • woodson 7 days ago

    It uses MLX (https://github.com/ml-explore/mlx), Apple’s ML framework, for running LLMs.

    • pk-protect-ai 7 days ago

      Why do people tend to nail stuff like this into their products?

      We have been talking about the AI revolution for several years now, and yet there is no IDE or VS Code plugin that supports multiple OpenAI-compatible endpoints. Some, like Cody, don't even support "private" LLMs other than an ollama endpoint on localhost. Cursor supports only one endpoint for OpenAI-compatible models.

      I made a custom version of ChatGPT.nvim for myself so I could use the models I like (mostly by removing the hardcoded gpt-3), but I dropped it because I would have had to spend time maintaining and improving it instead of doing my job.

      I'd like to run several specialized models with a vLLM engine and serve them at different endpoints, and then I'd like an IDE to be able to use these specialized LLMs for different purposes. Does anyone know a vim/neovim/vscode plugin that supports several OPENAI_API_HOST endpoints?

      For now, this is only possible with agent frameworks, but that's not really what I need.
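
      To illustrate the kind of routing I mean, here's a sketch with the plain openai Python client against two vLLM servers (the endpoint URLs and model names are made up):

      ```python
      # Two vLLM servers, each serving a different specialized model behind an
      # OpenAI-compatible API. Endpoint URLs and model names are hypothetical.
      from openai import OpenAI

      code_llm = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
      chat_llm = OpenAI(base_url="http://localhost:8001/v1", api_key="EMPTY")

      def complete(client, model, prompt):
          resp = client.chat.completions.create(
              model=model,
              messages=[{"role": "user", "content": prompt}],
          )
          return resp.choices[0].message.content

      # Route different tasks to different endpoints:
      refactor = complete(code_llm, "my-code-model", "Refactor this function: ...")
      summary = complete(chat_llm, "my-chat-model", "Summarize this diff: ...")
      ```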

  • throwaway314155 7 days ago

    Not OP, but it presumably uses an open LLM that won't run at a usable speed without a fast machine.

ZYbCRq22HbJ2y7 7 days ago

What is a good method for sandboxing models? I would like to trust these projects, but downloading hard-to-analyze arbitrary code and running it seems problematic.

  • godelski 7 days ago

    Probably nspawn[0]. Think of it as chroot on steroids, without being as heavy as Docker. You can run these containers in ephemeral mode, so modifications aren't permanent. As with typical systemd units, you can also limit read/write access, networking, and anything else you want, including restricting which commands are available. So you can make the program run only within its own scope, with read-only access and a very limited command set.

    Not the most secure thing, but you can move up to a VM, and then you'd probably want an air-gapped second machine if you're seriously concerned but not enough to go offsite.

    [0] https://wiki.archlinux.org/title/Systemd-nspawn
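
    A minimal sketch of the ephemeral approach (the container root is a hypothetical path, and this assumes a root filesystem with the tool installed already exists there):

    ```python
    # Sketch: launch the assistant inside an ephemeral, network-isolated
    # systemd-nspawn container. Filesystem changes are discarded on exit.
    import subprocess

    subprocess.run([
        "sudo", "systemd-nspawn",
        "--ephemeral",                          # throw away changes on exit
        "--private-network",                    # no outside network access
        "-D", "/var/lib/machines/llm-sandbox",  # hypothetical container root
        "vimlm",                                # command to run inside
    ])
    ```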

  • heyitsguay 7 days ago

    The attack surface of a local LLM is much smaller than that of almost any other program you would download. Make sure you trust whatever LLM execution stack is being used (apparently MLX here? I'm not familiar with that one specifically), and then the amount of additional code shipped with a given LLM should be tiny: most of it is a weight blob that may be tough to interpret but can't really do anything nefarious, since data just passes through it.

    Again, not sure what MLX does, but cf. the files for DeepSeek-R1 on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main

    Two files contain arbitrary executable code: one defines a simple config on top of a common config class, the other defines the model architecture. Even if you can't verify yourself that nothing sneaky is happening, it's easy for the community, because the structure of valid config and model-definition files is so tightly constrained: no network calls, no filesystem access, just definitions of (usually PyTorch) model layers that get assembled into a computation graph. Anything deviating from that form is going to stand out. It's quite easy to analyze.
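
    For illustration, here's a toy example I wrote of the shape such a config file takes (not DeepSeek's actual code): plain hyperparameters on top of a shared base class, which is why anything unusual is easy to spot.

    ```python
    # Toy Hugging Face-style config file, written for illustration (not the
    # actual DeepSeek-R1 files): just hyperparameters on a shared base class,
    # with no network calls and no filesystem access.
    from transformers import PretrainedConfig

    class ToyModelConfig(PretrainedConfig):
        model_type = "toy_model"

        def __init__(self, hidden_size=1024, num_layers=12, vocab_size=32000, **kwargs):
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.vocab_size = vocab_size
            super().__init__(**kwargs)
    ```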

  • kennysoona 7 days ago

    Running it in a podman/docker container would be more than sufficient and is probably the easiest approach.

thor_molecules 7 days ago

Consider exposing commands that users can assign to their own preferred keybindings, instead of choosing the bindings for them.

dbacar 7 days ago

A good update for an editor that can't handle indenting out of the box!