How to use --tools all #22132

alan-l · 2026-04-19T20:47:44Z

Can someone help me out in understanding how to use --tools all on llama-server.exe?

I've tried searching all sections of this project, but I'm stuck on how to proceed. I've even tried to add http://127.0.0.1:8080 and http://127.0.0.1:8080/tools as MCP endpoints (with and without "use llama-server proxy" enabled), but it didn't work. For some reason, I'm not seeing how to get my LLM to do things like read and write local files, or how to point to which directory it has permission to do these things. I'm able to get mcp.exa.ai to work locally, but that's all I was able to get working.

Hope someone can point me in the right direction!

Thank you.

PremiumPerfume · 2026-04-20T09:32:49Z

I’ve worked with similar setups while integrating LLM tooling into internal workflows at my company, and the --tools all flag on llama-server.exe only works when the server is configured with valid tool definitions in the tools directory. The flag doesn’t automatically enable file access unless the tool JSON files explicitly define those capabilities.

2 replies

alan-l Apr 20, 2026
Author

Thanks for the hint, @PremiumPerfume. Getting started is the hard step. Could you maybe give me a clue on how to do this? I just need something tangible to get started with.

Thank you!

chantravis0 Apr 24, 2026

That is my issue too.
I want to know how to use the built-in tools option:
--tools TOOL1,TOOL2,...    experimental: whether to enable built-in tools for AI agents - do not enable in untrusted environments (default: no tools)
                           specify "all" to enable all tools
                           available tools: read_file, file_glob_search, grep_search, exec_shell_command, write_file, edit_file, apply_diff

crystalsighting · 2026-04-25T12:49:18Z

curl -X POST http://127.0.0.1:8080/tools -H "Content-Type: application/json" -d '{"tool":"exec_shell_command","params":{"command":"ls -la"}}'

0 replies

ibondarenko1 · 2026-05-09T21:57:26Z

the two earlier answers are misleading you, neither matches what --tools all actually does on llama-server.

--tools all on llama-server is purely a chat-template / function-calling switch. it tells the server to apply the model's tool-call jinja template to the conversation so the LLM can emit structured tool_call JSON in its responses. it does NOT add a /tools execution endpoint, and llama-server has no built-in shell or filesystem access. the curl example above with exec_shell_command against /tools isn't a real endpoint, that request will 404.

what you actually need for read/write local files:

llama-server stays the model backend (openai-compatible). you separately run an MCP client like claude desktop, continue, cline, or cursor, point it at llama-server as the LLM, and add an MCP server for filesystem access on the side. for filesystem the canonical one is @modelcontextprotocol/server-filesystem:

{
  "mcpServers": {
    "fs": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "C:/work/some-allowed-dir"]
    }
  }
}

flow: you talk to your MCP client, it asks llama-server (your model) to plan, the model emits a tool_call for read_file, the client routes that to the filesystem MCP server, which does the read inside the allowed dir, returns the content to the client, the client feeds it back to the model. the directory you pass to server-filesystem is the only path the model can see, that's your sandbox.
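To make the "allowed dir is your sandbox" idea concrete, here is a minimal Python sketch of the kind of path check a filesystem tool server could apply before touching disk. This is an illustration only, not the actual implementation of @modelcontextprotocol/server-filesystem; the function name `resolve_in_sandbox` is hypothetical.

```python
from pathlib import Path


def resolve_in_sandbox(allowed_dir: str, requested: str) -> Path:
    """Return the absolute path for `requested`, or raise if it escapes `allowed_dir`.

    Hypothetical helper for illustration; not part of any MCP server.
    """
    root = Path(allowed_dir).resolve()
    # Joining with an absolute `requested` discards `root`; the check below catches that.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside {root}")
    return candidate
```

A request such as `resolve_in_sandbox("C:/work/some-allowed-dir", "../secrets.txt")` raises `PermissionError`, because the resolved path no longer sits under the allowed directory.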

--tools all becomes useful in that flow because it makes sure the model actually emits tool_calls in the format the client expects. without it (or --jinja with the model's own template), the model might output tool calls as prose and the client won't pick them up.

if you want to skip MCP entirely and have llama-server run shell commands itself, that doesn't exist in llama.cpp upstream. you'd need a tiny proxy between your client and llama-server that intercepts tool_calls, executes them, and returns results, but that's home-rolled. honestly the MCP path is exactly what the ecosystem standardized for this and it's less code than the alternative.
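For anyone curious what such a home-rolled loop looks like, here is a rough Python sketch under stated assumptions. It assumes the model's replies follow the OpenAI-style chat format (an assistant message with a `tool_calls` list, answered by `role: "tool"` messages). The `send` callable is a hypothetical stand-in for "POST the messages to the server's chat completions endpoint and return the assistant message"; it is left abstract so the loop itself is easy to follow and test.

```python
import json
from typing import Callable


def run_tool_loop(send: Callable[[list], dict], tools: dict,
                  messages: list, max_rounds: int = 5) -> str:
    """Keep calling the model until it answers without requesting a tool.

    `send` and `tools` are supplied by the caller (hypothetical names).
    """
    for _ in range(max_rounds):
        reply = send(messages)  # assistant message dict
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content") or ""
        for call in calls:
            fn = call["function"]
            # Arguments arrive as a JSON-encoded string in this format.
            args = json.loads(fn.get("arguments") or "{}")
            handler = tools.get(fn["name"])
            result = handler(**args) if handler else f"unknown tool: {fn['name']}"
            # Feed the result back so the model can use it on the next round.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": str(result),
            })
    raise RuntimeError("model kept requesting tools; giving up")
```

The `tools` dict maps tool names to plain Python functions, so whatever you register there is exactly what the model can trigger. That is also why the sandboxing question matters so much.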

0 replies

HermestoAizales · 2026-05-24T05:42:37Z

Update 2026-05: --tools all has changed since the original comment. llama-server now has actual built-in tools (read_file, write_file, exec_shell_command, grep_search, file_glob_search, edit_file, apply_diff, get_datetime) that run directly in the server, so no separate MCP client is needed. These are exposed through the WebUI agent endpoint, not via plain HTTP on /tools (the curl example in the comment above will still 404). The MCP route (Claude Desktop, Cursor + server-filesystem) is still the better choice for production setups with proper sandboxing: the built-in tools have no directory restriction and execute commands directly on the host, so only use --tools all in trusted environments.

3 replies

alan-l May 24, 2026
Author

I think they turned on that feature shortly after I made this post. I was jumping for joy when I discovered it, until I came across a very nasty bug in the "edit_file" tool. That bug is currently documented in #23246 and has been confirmed by at least one other user.

For now, my workaround is to disable that tool in the settings, so the AI will work around that issue by calling "write_file" when it wants to edit/update a file. The problem with that is that it has to read the entire file, edit the file in-memory, then write that file as a new file. I think you can visualise the magnitude of the increase in token consumption because of this workaround.
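To put a rough number on that, here is a back-of-the-envelope Python sketch. It uses character counts as a crude proxy for tokens, and it assumes a targeted edit only needs to carry the old and new snippets; that is an assumption about how an edit tool would be called, not a description of edit_file's actual parameters.

```python
def estimate_chars(file_text: str, old: str, new: str) -> tuple[int, int]:
    """Return (chars moved by read-then-rewrite, chars moved by a targeted edit)."""
    rewritten = file_text.replace(old, new, 1)
    # Workaround: the whole file goes into context, then the whole file comes back out.
    read_then_rewrite = len(file_text) + len(rewritten)
    # Targeted edit (assumed): only the changed span travels.
    targeted_edit = len(old) + len(new)
    return read_then_rewrite, targeted_edit
```

For a file of a couple of thousand characters with a one-word change, the first number is in the thousands and the second is in single digits, and the gap grows linearly with file size.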

Hoping someone on the dev team has the time to see if this issue is a quick fix or not. It is an interesting and sizable show-stopper...

crystalsighting May 24, 2026

The curl command worked fine for me when I posted it: it listed all the files in the directory as the llama.cpp user. That may have changed in the latest commits.

I'm not sure how useful this actually is, other than for speed, since it's baked right into the llama-server tools endpoint. I guess you could write an MCP server that calls those endpoints and add it to Home Assistant, so that "Okay Nabu, read file.txt to me" works, modifying the HA system prompt to tell it about the tools, for those who like using their voice.

However, I question the extra TCP socket call versus just running a simple Python function of your own that does the same thing.

I guess it's only really useful if you don't want to bother coding those functions and would rather just parse the returned JSON instead, minimalist coders :)

I think it might only be useful for short commands that return a response right away. For something like "Okay Nabu, recompile llama.cpp for me", you might want to write your own function that does the git pull and recompile with longer timeouts.

It's nice to have options though. For that last example I'd probably use Python async to execute the commands, then make an API call to Home Assistant to notify my phone if it completed successfully, or broadcast the message on the speakers in the house.
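A minimal sketch of that "own function with a longer timeout" idea, using only Python's standard asyncio subprocess API. The function name `run_with_timeout` is hypothetical, and the notification step is left out because it depends on your own Home Assistant setup.

```python
import asyncio
import sys


async def run_with_timeout(argv: list[str], timeout_s: float):
    """Run a command without blocking; return (exit_code, output) or (None, "timed out")."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout_s)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave a long build running in the background
        await proc.wait()
        return None, "timed out"
    return proc.returncode, out.decode(errors="replace")


if __name__ == "__main__":
    # Stand-in command; a real build would pass its own argv and a much longer timeout.
    code, output = asyncio.run(run_with_timeout([sys.executable, "-c", "print('done')"], 60))
    print(code, output.strip())
```

Passing the command as an argv list rather than a shell string avoids shell injection if any part of it comes from the model.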

For those who got 404 errors: substitute 127.0.0.1 with your own internal IP, unless you're on the same VM as llama.cpp.

alan-l May 25, 2026
Author

Your curl command is great, and it is how I was testing these functions before the devs enabled built-in tools. Do keep in mind that in their documentation they say explicitly to not use these endpoints directly as they are subject to change and removal.

Using curl commands still incurs the extra step of feeding those results back into the LLM, either via custom code or through the web UI. With built-in tools enabled, that extra step is not needed, and it is probably far more efficient than external curl calls.

Regardless, the current problem for the buggy edit_file still remains and is a sizeable showstopper.

ibondarenko1 · 2026-05-25T03:25:23Z

Quick follow-up, since the edit_file bug (#23246) is now the showstopper: a hybrid setup works around it without losing built-in speed.

Keep --tools all enabled for read_file, grep_search, write_file, and exec_shell_command (those are fine), and route just edit_file through @modelcontextprotocol/server-filesystem. Cursor, Continue, and Cline all handle the mixed setup; the client picks which backend each tool_call goes to. You skip the read-then-rewrite token tax and you also get directory sandboxing for the edit path.

One thing worth flagging given exec_shell_command is enabled by --tools all: the built-in tools have no directory restriction and run as the llama-server process. If the model gets prompt-injected via any content it reads, it can rm -rf or exfiltrate anything the process can reach. Run inside a container or VM with a bind-mounted work directory if this is more than a scratch box.

0 replies

AlexData-Hawkhill · 2026-06-11T02:48:31Z

Great coding, you llama guys! I love this project! The CLI tool is awesome, and the SERVER tool is even more awesome! Love it!

However, I struggle with one thing in both b9584 and b9585 (the newest as of yesterday): how do I use tools? I have added "--tools all" to llama-server via the script I use to start it (on Linux). In the GUI I see the tools under the "+" next to the message field, and I can enable and disable tools there (like the edit tool mentioned in the comments here).

But when I ask the loaded LLM (qwen3.6, decent with tools) to "ls this folder" or "edit ~/test.txt" or "curl some-page-url.com", the best answer it can give me is "here is the command I would/will run". It then states the command and nothing happens, or the model gives up waiting and invents/hallucinates a 404 for the curl. Specifically, how do I ask "any model" to "cd to this folder xyz", "ls that folder", and "tell me the filenames you see"? What is missing? Why do the tools not do anything?

I've made my own Python GUI that does this. This GUI and the whole of llama.cpp are way better than all my personal projects, but on this one thing it is really hard to get it to do anything at all. So... how? Step by step (ELI5, for those who know), and please don't skip steps like "you also need to have x.json file in y folder, and also, I forgot to tell you, it can only be done with an MCP server".

Please provide a working example that a noob can follow (yes, I'll be that noob, no problem, I volunteer! he he!). I've tested and read and tested, but this is the one I struggle with, even though I can code a full GUI in Python that lets LLMs not trained to use tools use tools anyway. So clearly the same must be possible in C code; I just don't get how to activate it so that a model trained on tools can use a tool.

3 replies

jboero Jun 12, 2026

Second that. I love how the tools feature turned out. With great power, great responsibility. I think I'll start running llama-server strictly in containers now. 🫣 Just a matter of time before someone wipes out their machine by mistake.

@AlexData-Hawkhill I'm pretty sure these file tools are for coding agents, so for VS Code plugins like Cline, etc. I don't think you're expected to use this in the browser. Date/time tools are great for the browser though.

AlexData-Hawkhill Jun 14, 2026

Oh, ok, when I made my own CLI based chat program, I gave ANY LLM the option to use tools, even those not trained for it. With only tiny adjustments in the system prompt, any LLM can learn how to use a tool. I even let them use tools for multiple rounds. So I was assuming this tool section would work in a similar way, but apparently it is more for code plugins and IDE, according to what you answered me. Well, hopefully they make it possible to also edit a file, while chatting via the llama-server!
Thanks for your quick reply! 😄

jboero Jun 14, 2026

Oh, actually that does work too. That's why tools are disabled by default. When using an agent like Cline, I believe the built-in tools are unnecessary because it will do the same actions on the client side over the API.

A simple test of whether a model can use the tools is to ask it what time it is. llama-server will show a prompt asking the user to approve use of a tool to get the current date and time. The system prompt template injects the tools enabled in each conversation and how to use them. I believe grepping a file is also a standard tool option. I've had good luck with Qwen and most tools, but you won't actually need them for Cline.

Actually, it looks like you might be running Windows (.exe?), which may be why. Are tools like grep and date installed?
