Uses for local AI?

I'm using Ollama on my server with the WebUI. It has no GPU, so it's not quick to reply, but not too slow either.

I'm thinking about removing the VM since I just don't use it. Are there any good uses or integrations with other apps that might convince me to keep it?
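
To give an idea of what I mean by "integrations": anything that can make an HTTP request can use the Ollama server as a backend. A rough Python sketch (assuming Ollama's default port 11434 and a hypothetical hostname; the model just needs to be one that's already pulled):

    import requests  # any HTTP client works; requests is just for illustration

    resp = requests.post(
        "http://my-server:11434/api/generate",  # hypothetical hostname, default Ollama port
        json={
            "model": "llama3",  # any model already pulled on the server
            "prompt": "Summarize what Ollama does in one sentence.",
            "stream": False,  # return a single JSON object instead of a token stream
        },
    )
    print(resp.json()["response"])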

  • I use the Continue VS Code plugin with Ollama, running a couple of different models (deepseek-coder-v2 & starcoder2), to recreate a local-only GitHub Copilot type experience for coding. This is on Apple Silicon (an M1), though. For autocomplete the generation needs to be pretty brisk - I'm not sure how that would go in a VM without a GPU.
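
    Roughly, the setup is just pointing Continue at Ollama as the model provider. A sketch of the relevant bits of Continue's config.json (treat the exact model names as illustrative of my setup, not a recommendation):

      {
        "models": [
          {
            "title": "DeepSeek Coder v2",
            "provider": "ollama",
            "model": "deepseek-coder-v2"
          }
        ],
        "tabAutocompleteModel": {
          "title": "StarCoder2",
          "provider": "ollama",
          "model": "starcoder2"
        }
      }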

    • How well does the M1 chip keep up? What size models are you running on it? I'm interested in getting an M1 laptop, so I'm curious.

      • NAME                        ID              SIZE    MODIFIED
        starcoder2:latest           f67ae0f64584    1.7 GB  3 days ago
        phi3:latest                 d184c916657e    2.2 GB  3 weeks ago
        deepseek-coder-v2:latest    8577f96d693e    8.9 GB  3 weeks ago
        llama3:8b-instruct-q8_0     1b8e49cece7f    8.5 GB  3 weeks ago
        dolphin-mistral:latest      5dc8c5a2be65    4.1 GB  3 weeks ago
        codeqwen:latest             df352abf55b1    4.2 GB  3 weeks ago
        llama3:latest               365c0bd3c000    4.7 GB  4 weeks ago

        I mostly use starcoder2 with Continue for code autocomplete. The big deepseek-coder is a bit slow (I can feel it thinking), but it and the regular llama3 are good for chatbot-type programming questions.

        I don't really have anything to compare the M1's performance to. I guess the 8 GB models output text a little slower than the web versions of the same models, and the 4 GB ones about the same. Using ollama in the terminal, there's sometimes a 0.5-2 second pause before it starts outputting. Not with phi3, though - it's surprisingly snappy for the quality of its answers.
