In this article, we will see how to install vLLM on Linux in 4 easy steps. vLLM is a fast and easy-to-use library for large language model (LLM) inference and serving. It delivers fast, memory-efficient, high-throughput inference using techniques such as PagedAttention and continuous batching. It offers state-of-the-art serving throughput with optimized CUDA kernels, including integration with FlashAttention and FlashInfer.
It integrates seamlessly with popular Hugging Face models and supports tensor parallelism and pipeline parallelism for distributed inference. It also supports prefix caching and multi-LoRA. It runs on NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs, TPUs, and AWS Neuron. It is easy to install and use on Linux systems.
How to Install vLLM on Linux Using 4 Easy Steps
Step 1: Prerequisites
a) You should have a running Linux Server (in my case, I am using an Ubuntu 22.04 LTS system).
b) You should have Python version 3.10 or above installed on your system.
c) You should have the pip utility available on your system.
d) Minimum Hardware Requirements:-
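- RAM: At least 16 GB (more for larger models)
- GPU: NVIDIA GPU with compute capability 7.0 or higher, such as Volta (V100), Turing (T4, RTX 20 series), Ampere (A100, RTX 30 series), Ada Lovelace (RTX 40 series), or Hopper (H100)
- CUDA Version: You should have at least version 12.1.
- NVIDIA Driver: You should have at least version 525, the minimum driver series that supports CUDA 12.x.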
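Before moving on, the software prerequisites above can be checked with a short script. The following is only a minimal sketch: the helper names python_ok and tool_found are made up for this illustration, and it assumes that the nvidia-smi tool shipped with the NVIDIA driver is on your PATH when the driver is installed. It does not verify the CUDA or driver version itself.

```python
import shutil
import sys

# Minimum Python version taken from the prerequisites above.
MIN_PYTHON = (3, 10)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def tool_found(name):
    """Return True if an executable with this name is on PATH."""
    return shutil.which(name) is not None


# Hypothetical report layout, used only for this illustration.
report = {
    "python >= 3.10": python_ok(),
    "pip": tool_found("pip") or tool_found("pip3"),
    "nvidia-smi": tool_found("nvidia-smi"),
}

for name, ok in report.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

If nvidia-smi is reported as missing, the NVIDIA driver is probably not installed or not on PATH, and the GPU requirements should be sorted out before installing vLLM.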
Step 2: Install vLLM
To install the vLLM package from PyPI (Python Package Index), run the pip install vllm command as shown below. This command downloads and installs the vllm package along with all its required dependencies.
Ubuntu-Server@ubuntu:~$ pip install vllm
Collecting vllm
Downloading vllm-0.7.3-cp38-abi3-manylinux1_x86_64.whl (264.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 264.6/264.6 MB 996.7 kB/s eta 0:00:00
Collecting depyf==0.18.0
Downloading depyf-0.18.0-py3-none-any.whl (38 kB)
Collecting requests>=2.26.0
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 KB 4.7 MB/s eta 0:00:00
Collecting pydantic>=2.9
Downloading pydantic-2.10.6-py3-none-any.whl (431 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 431.7/431.7 KB 1.8 MB/s eta 0:00:00
Collecting msgspec
Downloading msgspec-0.19.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (211 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.6/211.6 KB 5.5 MB/s eta 0:00:00
Collecting blake3
Downloading blake3-1.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (376 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 376.4/376.4 KB 2.3 MB/s eta 0:00:00
Collecting xformers==0.0.28.post3
Downloading xformers-0.0.28.post3-cp310-cp310-manylinux_2_28_x86_64.whl (16.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.7/16.7 MB 5.6 MB/s eta 0:00:00
Collecting aiohttp
Downloading aiohttp-3.11.13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 6.4 MB/s eta 0:00:00
Collecting numba==0.60.0
Downloading numba-0.60.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.7/3.7 MB 4.9 MB/s eta 0:00:00
Collecting partial-json-parser
Downloading partial_json_parser-0.2.1.1.post5-py3-none-any.whl (10 kB)
Collecting torchaudio==2.5.1
Downloading torchaudio-2.5.1-cp310-cp310-manylinux1_x86_64.whl (3.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 4.6 MB/s eta 0:00:00
Collecting pyzmq
Downloading pyzmq-26.2.1-cp310-cp310-manylinux_2_28_x86_64.whl (874 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 874.2/874.2 KB 2.9 MB/s eta 0:00:00
Requirement already satisfied: pyyaml in /usr/lib/python3/dist-packages (from vllm) (5.4.1)
Collecting prometheus-fastapi-instrumentator>=7.0.0
Downloading prometheus_fastapi_instrumentator-7.0.2-py3-none-any.whl (18 kB)
Collecting fastapi[standard]!=0.113.*,!=0.114.0,>=0.107.0
Downloading fastapi-0.115.11-py3-none-any.whl (94 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.9/94.9 KB 1.9 MB/s eta 0:00:00
Collecting einops
Downloading einops-0.8.1-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.4/64.4 KB 7.2 MB/s eta 0:00:00
Collecting mistral_common[opencv]>=1.5.0
Downloading mistral_common-1.5.3-py3-none-any.whl (6.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 4.4 MB/s eta 0:00:00
Collecting torchvision==0.20.1
Downloading torchvision-0.20.1-cp310-cp310-manylinux1_x86_64.whl (7.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.2/7.2 MB 5.3 MB/s eta 0:00:00
Collecting ray[adag]==2.40.0
Downloading ray-2.40.0-cp310-cp310-manylinux2014_x86_64.whl (66.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.8/66.8 MB 3.6 MB/s eta 0:00:00
Collecting filelock>=3.16.1
Downloading filelock-3.17.0-py3-none-any.whl (16 kB)
Collecting outlines==0.1.11
Downloading outlines-0.1.11-py3-none-any.whl (87 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 87.6/87.6 KB 3.2 MB/s eta 0:00:00
Collecting gguf==0.10.0
Downloading gguf-0.10.0-py3-none-any.whl (71 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.6/71.6 KB 4.5 MB/s eta 0:00:00
Collecting numpy<2.0.0
Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 4.5 MB/s eta 0:00:00
Collecting openai>=1.52.0
Downloading openai-1.65.2-py3-none-any.whl (473 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 473.2/473.2 KB 4.3 MB/s eta 0:00:00
Collecting torch==2.5.1
Downloading torch-2.5.1-cp310-cp310-manylinux1_x86_64.whl (906.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 906.4/906.4 MB 527.7 kB/s eta 0:00:00
Collecting psutil
Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 278.0/278.0 KB 7.0 MB/s eta 0:00:00
Collecting lark==1.2.2
Downloading lark-1.2.2-py3-none-any.whl (111 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 111.0/111.0 KB 6.0 MB/s eta 0:00:00
Collecting transformers>=4.48.2
Downloading transformers-4.49.0-py3-none-any.whl (10.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MB 3.6 MB/s eta 0:00:00
Collecting xgrammar==0.1.11
Downloading xgrammar-0.1.11-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (396 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 396.2/396.2 KB 6.2 MB/s eta 0:00:00
Collecting tokenizers>=0.19.1
Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 4.3 MB/s eta 0:00:00
Collecting cloudpickle
Downloading cloudpickle-3.1.1-py3-none-any.whl (20 kB)
Collecting lm-format-enforcer<0.11,>=0.10.9
Downloading lm_format_enforcer-0.10.11-py3-none-any.whl (44 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.2/44.2 KB 2.2 MB/s eta 0:00:00
Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from vllm) (5.28.3)
Collecting prometheus_client>=0.18.0
Downloading prometheus_client-0.21.1-py3-none-any.whl (54 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.7/54.7 KB 1.6 MB/s eta 0:00:00
Collecting py-cpuinfo
Downloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting tiktoken>=0.6.0
Downloading tiktoken-0.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 4.7 MB/s eta 0:00:00
Collecting sentencepiece
Downloading sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 4.5 MB/s eta 0:00:00
Requirement already satisfied: importlib_metadata in /usr/lib/python3/dist-packages (from vllm) (4.6.4)
Requirement already satisfied: pillow in /usr/lib/python3/dist-packages (from vllm) (9.0.1)
Collecting compressed-tensors==0.9.1
Downloading compressed_tensors-0.9.1-py3-none-any.whl (96 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.5/96.5 KB 3.0 MB/s eta 0:00:00
Collecting tqdm
Downloading tqdm-4.67.1-py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.5/78.5 KB 8.3 MB/s eta 0:00:00
Collecting typing_extensions>=4.10
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting astor
Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)
Collecting dill
Downloading dill-0.3.9-py3-none-any.whl (119 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.4/119.4 KB 7.9 MB/s eta 0:00:00
Collecting llvmlite<0.44,>=0.43.0dev0
Downloading llvmlite-0.43.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (43.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.9/43.9 MB 4.7 MB/s eta 0:00:00
Collecting referencing
Downloading referencing-0.36.2-py3-none-any.whl (26 kB)
Collecting diskcache
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 KB 2.3 MB/s eta 0:00:00
Collecting nest_asyncio
Downloading nest_asyncio-1.6.0-py3-none-any.whl (5.2 kB)
Collecting pycountry
Downloading pycountry-24.6.1-py3-none-any.whl (6.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 5.3 MB/s eta 0:00:00
Collecting jinja2
Downloading jinja2-3.1.5-py3-none-any.whl (134 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.6/134.6 KB 8.4 MB/s eta 0:00:00
Collecting outlines_core==0.1.26
Downloading outlines_core-0.1.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (343 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 343.6/343.6 KB 5.9 MB/s eta 0:00:00
Collecting airportsdata
Downloading airportsdata-20250224-py3-none-any.whl (913 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 913.7/913.7 KB 6.5 MB/s eta 0:00:00
Collecting interegular
Downloading interegular-0.3.3-py37-none-any.whl (23 kB)
Collecting jsonschema
Downloading jsonschema-4.23.0-py3-none-any.whl (88 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.5/88.5 KB 4.9 MB/s eta 0:00:00
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from ray[adag]==2.40.0->vllm) (24.2)
Collecting frozenlist
Downloading frozenlist-1.5.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (241 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 241.9/241.9 KB 3.5 MB/s eta 0:00:00
Collecting aiosignal
Downloading aiosignal-1.3.2-py2.py3-none-any.whl (7.6 kB)
Collecting msgpack<2.0.0,>=1.0.0
Downloading msgpack-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (378 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 378.0/378.0 KB 4.3 MB/s eta 0:00:00
Requirement already satisfied: click>=7.0 in /usr/lib/python3/dist-packages (from ray[adag]==2.40.0->vllm) (8.0.3)
Collecting cupy-cuda12x
Downloading cupy_cuda12x-13.4.0-cp310-cp310-manylinux2014_x86_64.whl (104.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.6/104.6 MB 3.4 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu12==12.4.127
Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (24.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.6/24.6 MB 4.8 MB/s eta 0:00:00
Collecting fsspec
Downloading fsspec-2025.2.0-py3-none-any.whl (184 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 184.5/184.5 KB 2.6 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12==11.2.1.3
Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl (211.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.5/211.5 MB 2.9 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==9.1.0.70
Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 456.4 kB/s eta 0:00:00
Collecting nvidia-cusparse-cu12==12.3.1.170
Downloading nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl (207.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.5/207.5 MB 2.0 MB/s eta 0:00:00
Collecting nvidia-nvjitlink-cu12==12.4.127
Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 3.1 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.4.127
Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (883 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 883.7/883.7 KB 3.3 MB/s eta 0:00:00
Collecting triton==3.1.0
Downloading triton-3.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (209.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.5/209.5 MB 1.6 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.4.5.8
Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl (363.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 363.4/363.4 MB 1.1 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.5.147
Downloading nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl (56.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 2.5 MB/s eta 0:00:00
Collecting nvidia-nccl-cu12==2.21.5
Downloading nvidia_nccl_cu12-2.21.5-py3-none-manylinux2014_x86_64.whl (188.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.7/188.7 MB 1.6 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu12==12.4.127
Downloading nvidia_nvtx_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (99 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 KB 1.6 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.6.1.9
Downloading nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl (127.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.9/127.9 MB 2.1 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.4.127
Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (13.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.8/13.8 MB 3.4 MB/s eta 0:00:00
Collecting networkx
Downloading networkx-3.4.2-py3-none-any.whl (1.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 3.6 MB/s eta 0:00:00
Collecting sympy==1.13.1
Downloading sympy-1.13.1-py3-none-any.whl (6.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 2.4 MB/s eta 0:00:00
Collecting pybind11
Downloading pybind11-2.13.6-py3-none-any.whl (243 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 243.3/243.3 KB 2.0 MB/s eta 0:00:00
Collecting pytest
Downloading pytest-8.3.4-py3-none-any.whl (343 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 343.1/343.1 KB 3.4 MB/s eta 0:00:00
Collecting mpmath<1.4,>=1.1.0
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 1.7 MB/s eta 0:00:00
Collecting starlette<0.47.0,>=0.40.0
Downloading starlette-0.46.0-py3-none-any.whl (71 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.0/72.0 KB 9.5 MB/s eta 0:00:00
Collecting email-validator>=2.0.0
Downloading email_validator-2.2.0-py3-none-any.whl (33 kB)
Collecting httpx>=0.23.0
Downloading httpx-0.28.1-py3-none-any.whl (73 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 73.5/73.5 KB 2.7 MB/s eta 0:00:00
...........................................
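Once pip finishes, it can be useful to confirm that the package is actually visible to the Python interpreter you plan to use. The sketch below relies only on the standard library's importlib.metadata; the helper name installed_version is an assumption made for this example and is not part of vLLM.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("vllm")
if version:
    print(f"vllm version: {version}")
else:
    print("vllm is not installed in this environment")
```

If the script reports that vllm is not installed even though pip succeeded, pip and python most likely point to different Python installations.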
Step 3: Using vLLM
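Now that vLLM is installed, let's look at an example to understand its usage. In the Python code below, we load the Llama-2 model and generate a response using vLLM's LLM engine. Note that meta-llama/Llama-2-7b-hf is a gated model on Hugging Face, so you need approved access and a logged-in Hugging Face account before it can be downloaded.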
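Ubuntu-Server@ubuntu:~$ nano vllm_example.py
from vllm import LLM, SamplingParams

# Load the model (downloaded from Hugging Face on first use)
llm = LLM(model="meta-llama/Llama-2-7b-hf")

# Sampling settings: moderate randomness, nucleus sampling, at most 100 new tokens
sampling_params = SamplingParams(temperature=0.6, top_p=0.7, max_tokens=100)

# generate() returns one result object per prompt
output = llm.generate("What is the capital of Germany?", sampling_params)
for text in output:
    print(text.outputs[0].text)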
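The temperature and top_p values in the example above control how the next token is sampled. To give an intuition for what they do, here is a small pure-Python sketch that applies both to a toy set of scores. It is not vLLM's implementation; the function names and the toy_logits values are made up for this illustration.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (lower = sharper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p.

    Returns a dict of token index -> renormalized probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


# Toy scores for four candidate tokens (illustrative values only).
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = apply_temperature(toy_logits, temperature=0.6)
candidates = top_p_filter(probs, top_p=0.7)
print(candidates)
```

A lower temperature concentrates probability on the most likely tokens, and a lower top_p shrinks the pool of tokens the sampler may pick from, so together they make the output more focused and less varied.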
Step 4: Uninstall vLLM
Once you are done using vLLM, you can uninstall it from your system with the pip uninstall vllm command as shown below.
Ubuntu-Server@ubuntu:~$ pip uninstall vllm
Found existing installation: vllm 0.7.3
Uninstalling vllm-0.7.3:
  Would remove:
    /home/Ubuntu-Server/.local/bin/vllm
    /home/Ubuntu-Server/.local/lib/python3.10/site-packages/vllm-0.7.3.dist-info/*
    /home/Ubuntu-Server/.local/lib/python3.10/site-packages/vllm/*
Proceed (Y/n)? Y
  Successfully uninstalled vllm-0.7.3