A research platform for a micro inference server for SL5.
μInference creates a minimal bootable Linux ISO with embedded LLaMA model inference capabilities. It packages llama2.c into a custom Linux distribution that boots directly into an inference environment.
Install build dependencies:

```shell
sudo apt update && sudo apt install -y \
build-essential gcc g++ make git wget curl \
xorriso mtools dosfstools \
flex bison bc kmod cpio \
libelf-dev libssl-dev \
libncurses-dev \
qemu-system-x86 \
lld nasm
```
```shell
# Clone and build
git clone https://github.com/luiscosio/muinference
cd muinference
make iso
```
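Before booting, it can be worth sanity-checking the build artifact; a small sketch (the `l2e_boot/muinference.iso` path comes from this README's output section, and `file`'s exact wording varies by version):

```shell
# Sanity-check the build artifact before booting it
ISO=l2e_boot/muinference.iso
if [ -f "$ISO" ]; then
    file "$ISO"    # should report an ISO 9660 filesystem
    du -h "$ISO"   # expect roughly 50MB
else
    echo "no image at $ISO; run 'make iso' first"
fi
```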
```shell
# Test in QEMU
make boot_iso
```

Once booted, you can interact with the LLaMA model using the `talk` command:

```shell
# Ask a question
talk "What is the meaning of life?"

# Tell a story
talk "Tell me a story about a robot"
```

Make targets:

```shell
make help       # Show all targets
make iso        # Build ISO (default)
make boot_iso   # Test in QEMU
make clean      # Clean build artifacts
make distclean  # Remove all sources
```

Parallel builds:

```shell
make fast-build # Maximum parallelization
make JOBS=8 iso # Use 8 parallel jobs
```

Building produces `l2e_boot/muinference.iso` (~50MB) containing:
- Linux kernel 6.5 with L2E module
- llama2.c with stories15M model (15M parameters)
- Minimal userspace (musl + busybox)
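The `talk` command is presumably a thin wrapper over llama2.c's `run` binary, which takes a model checkpoint plus sampling flags (`-t` temperature, `-n` steps, `-i` input prompt). A hypothetical sketch — the model path and flag values here are assumptions, not the project's actual script:

```shell
#!/bin/sh
# Hypothetical `talk` wrapper; MODEL path and flag values are assumptions.
MODEL="${MODEL:-/root/stories15M.bin}"
PROMPT="$*"
if command -v run >/dev/null 2>&1 && [ -f "$MODEL" ]; then
    exec run "$MODEL" -t 0.8 -n 256 -i "$PROMPT"
else
    echo "run or model missing; would execute: run $MODEL -t 0.8 -n 256 -i '$PROMPT'"
fi
```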
Licenses:

- Linux kernel: GPL v2
- llama2.c: MIT
- Project: MIT