Enable qwen2vl video #2756

Open: wants to merge 54 commits into base: main

Commits (54)
18c9f06
WIP video support
mfarre Nov 11, 2024
7c67939
router changes
mfarre Nov 13, 2024
5ced960
adopting video url
mfarre Nov 13, 2024
05464d2
connecting video to qwen2
mfarre Nov 14, 2024
c7c2fda
fix
mfarre Nov 14, 2024
b9c8152
downloading videos
mfarre Nov 14, 2024
464609f
fix
mfarre Nov 14, 2024
a25c3ec
refactoring
mfarre Nov 14, 2024
3c07391
fix
mfarre Nov 14, 2024
b2c5575
feat: support video input chunks and enable qwen2 vl to process video
drbh Nov 18, 2024
83a7f18
fix: add protobuf update and mp4parse dep
drbh Nov 18, 2024
322165d
fix: remove unused deps and imports
drbh Nov 18, 2024
e65ead1
moving video sampling and resize to validation. downstream we receive…
mfarre Nov 22, 2024
36e095b
flatten frames to data block when needed
mfarre Nov 25, 2024
bc5e202
fix: adjust video process, reduce to 1 fps and adjust tensor shape
drbh Nov 25, 2024
1afaa69
fix: adjust deps after rebase
drbh Nov 25, 2024
16007b6
feat: adjust impure shell deps and autodocs workflow
drbh Nov 26, 2024
39fac7e
fix: include more deps for ffmpeg as docs suggest
drbh Nov 26, 2024
b508b10
fix: add ffmpeg deps to test build
drbh Nov 26, 2024
4a3a724
fix: debug ffmpeg install in tests workflow
drbh Nov 26, 2024
ac7483c
fix: debug ffmpeg deps in tests II
drbh Nov 26, 2024
137f3bb
fix: adjust dependencies and bump pip along with python
drbh Nov 26, 2024
daf83a9
fix: adjust pkg config in test
drbh Nov 26, 2024
d5cc670
fix: ensure pip is installed after installing deps in test workflow
drbh Nov 26, 2024
4a76e8b
fix: add libavfilter dep to test
drbh Nov 26, 2024
f0c3841
fix: add libavdevice dep to tests workflow
drbh Nov 26, 2024
96968a0
fix: add ffmpeg overlay and enable build
drbh Nov 27, 2024
167c6f0
fix: include ffmpeg deps in autodocs workflow
drbh Nov 27, 2024
98392a7
Cleanup impure Nix shell
danieldk Nov 27, 2024
05004a6
Make the pure build work
danieldk Nov 27, 2024
063104c
Fix test devshell
danieldk Nov 27, 2024
2dc078a
fix: bump deps in other dockerfiles
drbh Nov 27, 2024
50b5399
fix: add ffmpeg to final layer of container
drbh Nov 27, 2024
b5b2184
fix: include usr lib in ld path
drbh Nov 28, 2024
75ab887
fix: copy shared libraries from builder
drbh Dec 2, 2024
cbf1d98
installing ssl requirements prior to rust building stage
mfarre Dec 4, 2024
af77a0c
fixing ssl issue
mfarre Dec 11, 2024
19e1c8d
working version
mfarre Dec 11, 2024
db97d97
cleanup prints
mfarre Dec 11, 2024
71ed75a
fix: pre commit and clippy lints
drbh Dec 12, 2024
e2b75a5
fix: resolve rebase issues and add test
drbh Dec 12, 2024
1d6bf24
fix: remove unnecessary cast
drbh Dec 12, 2024
2ae152a
fix: update all vlm forward args, pass shared libraries to final laye…
drbh Dec 12, 2024
5c7bc91
fix: adjust batch_tokenized_inputs output in mllama
drbh Dec 13, 2024
bb00fb3
fix: update lints after rebase
drbh Dec 13, 2024
91ed362
fix: update trtllm dockefile after rebase
drbh Dec 13, 2024
5322abd
fix: adjust whitespace lint
drbh Dec 13, 2024
b4da6ad
fix: feature flag video and remove from non cuda dockerfiles
drbh Dec 13, 2024
27f758d
fix: make ffmpeg-next dep optional with feature
drbh Dec 13, 2024
4f42d0c
fix: include the video feature in cargo chef command
drbh Dec 13, 2024
dcc1194
fix: adjust trtllm looper for video chunk enum
drbh Dec 16, 2024
b27749e
fix: small refactor and cleanups
drbh Jan 3, 2025
78cd756
fix: improve video processing and update unsupported paths
drbh Jan 16, 2025
17192c9
fix: remove test debug params
drbh Jan 17, 2025
2 changes: 1 addition & 1 deletion .github/workflows/autodocs.yaml
@@ -20,7 +20,7 @@ jobs:
- name: Install Protocol Buffers compiler
run: |
sudo apt-get update
-sudo apt-get install -y protobuf-compiler libprotobuf-dev
+sudo apt-get install -y protobuf-compiler libprotobuf-dev clang libavcodec-dev libavfilter-dev libavdevice-dev libavformat-dev libavutil-dev pkg-config

Review comment (Collaborator):
No dev packages please (we shouldn't need them most likely).
Why is clang in there?

- name: Install Launcher
id: install-launcher
4 changes: 3 additions & 1 deletion .github/workflows/tests.yaml
@@ -43,7 +43,9 @@ jobs:
- name: Install
run: |
sudo apt update
-sudo apt install python3.11-dev -y
+sudo apt install python3.11-dev python3.11-venv python3-pip clang libavcodec-dev libavfilter-dev libavdevice-dev libavformat-dev libavutil-dev pkg-config -y
+export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib/x86_64-linux-gnu/pkgconfig

Review comment (Collaborator):
Not env shenanigans

+python -m pip install --upgrade pip
make install-cpu
- name: Run server tests
run: |
89 changes: 88 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

73 changes: 49 additions & 24 deletions Dockerfile
@@ -20,14 +20,28 @@ FROM chef AS builder

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
python3.11-dev
+
+RUN apt-get update && apt-get install -y \
+ffmpeg \
+libavcodec-dev \
+libavfilter-dev \
+libavdevice-dev \
+libavformat-dev \
+libavutil-dev \
+libswscale-dev \
+pkg-config \
+libclang-dev \
+clang \
+&& rm -rf /var/lib/apt/lists/*
+
RUN PROTOC_ZIP=protoc-21.12-linux-x86_64.zip && \
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP && \
unzip -o $PROTOC_ZIP -d /usr/local bin/protoc && \
unzip -o $PROTOC_ZIP -d /usr/local 'include/*' && \
rm -f $PROTOC_ZIP

COPY --from=planner /usr/src/recipe.json recipe.json
-RUN cargo chef cook --profile release-opt --recipe-path recipe.json
+RUN cargo chef cook --profile release-opt --features video --recipe-path recipe.json

ARG GIT_SHA
ARG DOCKER_LABEL
@@ -40,7 +54,7 @@ COPY benchmark benchmark
COPY router router
COPY backends backends
COPY launcher launcher
-RUN cargo build --profile release-opt --frozen
+RUN cargo build --profile release-opt --frozen --features video

# Python builder
# Adapted from: https://github.com/pytorch/pytorch/blob/master/Dockerfile
@@ -61,18 +75,18 @@ ARG TARGETPLATFORM
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-build-essential \
-ca-certificates \
-ccache \
-curl \
-git && \
-rm -rf /var/lib/apt/lists/*
+build-essential \
+ca-certificates \
+ccache \
+curl \
+git && \
+rm -rf /var/lib/apt/lists/*

# Install conda
# translating Docker's TARGETPLATFORM into mamba arches
RUN case ${TARGETPLATFORM} in \
-"linux/arm64") MAMBA_ARCH=aarch64 ;; \
-*) MAMBA_ARCH=x86_64 ;; \
+"linux/arm64") MAMBA_ARCH=aarch64 ;; \
+*) MAMBA_ARCH=x86_64 ;; \
esac && \
curl -fsSL -v -o ~/mambaforge.sh -O "https://github.com/conda-forge/miniforge/releases/download/${MAMBA_VERSION}/Mambaforge-${MAMBA_VERSION}-Linux-${MAMBA_ARCH}.sh"
RUN chmod +x ~/mambaforge.sh && \
@@ -82,21 +96,24 @@ RUN chmod +x ~/mambaforge.sh && \
# Install pytorch
# On arm64 we exit with an error code
RUN case ${TARGETPLATFORM} in \
-"linux/arm64") exit 1 ;; \
-*) /opt/conda/bin/conda update -y conda && \
-/opt/conda/bin/conda install -c "${INSTALL_CHANNEL}" -c "${CUDA_CHANNEL}" -y "python=${PYTHON_VERSION}" "pytorch=$PYTORCH_VERSION" "pytorch-cuda=$(echo $CUDA_VERSION | cut -d'.' -f 1-2)" ;; \
+"linux/arm64") exit 1 ;; \
+*) /opt/conda/bin/conda update -y conda && \
+/opt/conda/bin/conda install -c "${INSTALL_CHANNEL}" -c "${CUDA_CHANNEL}" -y "python=${PYTHON_VERSION}" "pytorch=$PYTORCH_VERSION" "pytorch-cuda=$(echo $CUDA_VERSION | cut -d'.' -f 1-2)" "openssl>=3.3.0" ;; \
esac && \
/opt/conda/bin/conda clean -ya

+RUN /opt/conda/bin/conda install -y pyOpenSSL
+
+
# CUDA kernels builder image
FROM pytorch-install AS kernel-builder

ARG MAX_JOBS=8
ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;9.0+PTX"

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-ninja-build cmake \
-&& rm -rf /var/lib/apt/lists/*
+ninja-build cmake \
+&& rm -rf /var/lib/apt/lists/*

# Build Flash Attention CUDA kernels
FROM kernel-builder AS flash-att-builder
@@ -188,12 +205,15 @@ ENV HF_HOME=/data \
WORKDIR /usr/src

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-libssl-dev \
-ca-certificates \
-make \
-curl \
-git \
-&& rm -rf /var/lib/apt/lists/*
+libssl-dev \
+ca-certificates \
+make \
+curl \
+git \
+&& rm -rf /var/lib/apt/lists/*
+
+# Add ffmpeg libraries to the path
+ENV LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"

# Copy conda with PyTorch installed
COPY --from=pytorch-install /opt/conda /opt/conda
@@ -239,6 +259,8 @@ RUN cd server && \
ENV LD_PRELOAD=/opt/conda/lib/python3.11/site-packages/nvidia/nccl/lib/libnccl.so.2
# Required to find libpython within the rust binaries
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/conda/lib/"
+ENV LD_PRELOAD="/opt/conda/lib/libcrypto.so.3"
+
# This is needed because exl2 tries to load flash-attn
# And fails with our builds.
ENV EXLLAMA_NO_FLASH_ATTN=1
@@ -247,9 +269,9 @@ ENV EXLLAMA_NO_FLASH_ATTN=1
# The binaries change on every build given we burn the SHA into them
# The deps change less often.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-build-essential \
-g++ \
-&& rm -rf /var/lib/apt/lists/*
+build-essential \
+g++ \
+&& rm -rf /var/lib/apt/lists/*

# Install benchmarker
COPY --from=builder /usr/src/target/release-opt/text-generation-benchmark /usr/local/bin/text-generation-benchmark
@@ -258,6 +280,9 @@ COPY --from=builder /usr/src/target/release-opt/text-generation-router /usr/loca
# Install launcher
COPY --from=builder /usr/src/target/release-opt/text-generation-launcher /usr/local/bin/text-generation-launcher

+# Copy the ffmpeg libraries
+COPY --from=builder /usr/lib/x86_64-linux-gnu/* /usr/lib/x86_64-linux-gnu-copy/
+ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu-copy"

# AWS Sagemaker compatible image
FROM base AS sagemaker
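
The `--features video` flag added to the `cargo chef cook` and `cargo build` commands in this Dockerfile, together with the commit that makes the `ffmpeg-next` dependency optional, suggests that video handling is compiled in conditionally. A minimal Rust sketch of that pattern is shown here. The names `VideoSupport`, `video_support`, and `decode_video` are hypothetical and are not taken from the PR, and the "decode" step is a stand-in that only reports the byte count.

```rust
/// Hypothetical marker for whether the binary was built with video support.
#[derive(Debug, PartialEq)]
pub enum VideoSupport {
    Enabled,
    Disabled,
}

/// Compiled only when the crate is built with `--features video`.
#[cfg(feature = "video")]
pub fn video_support() -> VideoSupport {
    VideoSupport::Enabled
}

/// Fallback compiled when the `video` feature is absent.
#[cfg(not(feature = "video"))]
pub fn video_support() -> VideoSupport {
    VideoSupport::Disabled
}

/// Stand-in for a decoding entry point: it returns the input length when
/// video support is compiled in and a descriptive error otherwise.
pub fn decode_video(bytes: &[u8]) -> Result<usize, String> {
    match video_support() {
        VideoSupport::Enabled => Ok(bytes.len()),
        VideoSupport::Disabled => Err(
            "video support was not compiled in; rebuild with --features video".to_string(),
        ),
    }
}
```

With this layout, images built without the feature never link against the ffmpeg libraries, which would be consistent with the commit that removes video from the non-CUDA Dockerfiles.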
13 changes: 12 additions & 1 deletion backends/client/src/lib.rs
@@ -9,7 +9,7 @@ use thiserror::Error;
use tonic::transport;
use tonic::Status;

-pub use v3::{Chunk, Image, Input, InputChunk};
+pub use v3::{Chunk, Image, Input, InputChunk, Video};

#[async_trait]
pub trait Health {
@@ -79,6 +79,17 @@ impl ChunksToString for Vec<InputChunk> {
let encoded = STANDARD.encode(data);
output.push_str(&format!("![](data:{};base64,{})", mimetype, encoded))
}
+Some(Chunk::Video(Video {
+data,
+mimetype,
+width,
+height: _,
+frames: _,
+})) => {
+//
+// TODO: do not support serialization of video data
+unimplemented!("Video tokens are not supported for this model configuration")
+}
// We don't create empty chunks, so this should be unreachable.
None => unreachable!("Chunks should never be empty"),
});
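
The `Video` arm in this hunk calls `unimplemented!`, which panics at runtime if a video chunk ever reaches the string serializer. One possible alternative, sketched below under stated assumptions, is to return an error instead. The `Chunk` enum here is a simplified hypothetical stand-in for the generated protobuf type (the image variant is omitted for brevity), and `try_chunks_to_string` and `UnsupportedChunk` are illustrative names that do not exist in the PR.

```rust
use std::fmt;

/// Simplified, hypothetical stand-in for the generated `Chunk` type.
pub enum Chunk {
    Text(String),
    Video { data: Vec<u8>, mimetype: String },
}

/// Error reported when a chunk kind has no string representation.
#[derive(Debug, PartialEq)]
pub struct UnsupportedChunk(pub &'static str);

impl fmt::Display for UnsupportedChunk {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} chunks cannot be serialized to a string", self.0)
    }
}

impl std::error::Error for UnsupportedChunk {}

/// Concatenates text chunks and stops with an error at the first video chunk
/// instead of panicking.
pub fn try_chunks_to_string(chunks: &[Chunk]) -> Result<String, UnsupportedChunk> {
    let mut output = String::new();
    for chunk in chunks {
        match chunk {
            Chunk::Text(text) => output.push_str(text),
            Chunk::Video { .. } => return Err(UnsupportedChunk("video")),
        }
    }
    Ok(output)
}
```

Whether a fallible signature fits the existing `ChunksToString` trait is a design decision for the maintainers; the sketch only shows the shape such a change could take.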
2 changes: 1 addition & 1 deletion backends/client/src/v3/mod.rs
@@ -8,6 +8,6 @@ pub use client::Client;
pub use pb::generate::v3::{
input_chunk::Chunk, Batch, CachedBatch, FinishReason, GeneratedText, Generation, GrammarType,
HealthResponse, Image, InfoResponse, Input, InputChunk, NextTokenChooserParameters, Request,
-StoppingCriteriaParameters, Tokens,
+StoppingCriteriaParameters, Tokens, Video,
};
pub use sharded_client::ShardedClient;
1 change: 1 addition & 0 deletions backends/trtllm/src/looper.rs
@@ -301,6 +301,7 @@ impl TensorRtLlmBackendV2 {
1 => match request.inputs.first().expect("Single item-chunk") {
Chunk::Text(_) => Ok(()),
Chunk::Image(_) => Err(ValidationError(UnsupportedModality("image"))),
+Chunk::Video(_) => Err(ValidationError(UnsupportedModality("video"))),
},
}
}
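
The one-line change above is needed because Rust requires a `match` over an enum to be exhaustive, so adding a `Video` variant forces every backend to state how it handles video. A self-contained sketch of the same idea follows. The `Chunk` and `ValidationError` types are simplified hypothetical versions, and `validate_text_only`, `EmptyInput`, and `TooManyChunks` are illustrative names that are not part of the PR.

```rust
/// Simplified, hypothetical input chunk type.
pub enum Chunk {
    Text(String),
    Image(Vec<u8>),
    Video(Vec<u8>),
}

/// Simplified, hypothetical validation error type.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    EmptyInput,
    TooManyChunks(usize),
    UnsupportedModality(&'static str),
}

/// Accepts exactly one text chunk and rejects every other input shape.
/// Removing the `Video` arm would make the single-chunk cases
/// non-exhaustive only if the catch-all arm were absent, so each modality
/// is listed explicitly before the catch-all.
pub fn validate_text_only(inputs: &[Chunk]) -> Result<(), ValidationError> {
    match inputs {
        [] => Err(ValidationError::EmptyInput),
        [Chunk::Text(_)] => Ok(()),
        [Chunk::Image(_)] => Err(ValidationError::UnsupportedModality("image")),
        [Chunk::Video(_)] => Err(ValidationError::UnsupportedModality("video")),
        more => Err(ValidationError::TooManyChunks(more.len())),
    }
}
```

For example, `validate_text_only(&[Chunk::Video(vec![0])])` yields `Err(ValidationError::UnsupportedModality("video"))`, mirroring the behavior of the TensorRT-LLM backend for a single video chunk.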
2 changes: 1 addition & 1 deletion backends/v3/src/client/mod.rs
@@ -15,7 +15,7 @@ pub use grpc_client::Client;
pub use pb::generate::v3::{
input_chunk::Chunk, Batch, CachedBatch, FinishReason, GeneratedText, Generation, GrammarType,
HealthResponse, Image, InfoResponse, Input, InputChunk, NextTokenChooserParameters, Request,
-StoppingCriteriaParameters,
+StoppingCriteriaParameters, Video,
};
pub use sharded_client::ShardedClient;

7 changes: 7 additions & 0 deletions backends/v3/src/queue.rs
@@ -439,6 +439,13 @@ impl State {
data: image.data,
mimetype: image.mimetype,
}),
+Chunk::Video(video) => client::Chunk::Video(client::Video {
+data: video.data,
+mimetype: video.mimetype,
+width: video.width,
+height: video.height,
+frames: video.num_frames,
+}),
}),
})
.collect(),
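
The queue change above copies a validated video chunk into the client message field by field, renaming `num_frames` to `frames` along the way. The sketch below expresses the same mapping as a `From` implementation. Both structs are hypothetical simplifications: the field names follow the diff, but the `u32` types and the names `ValidatedVideo` and `ClientVideo` are assumptions made for illustration.

```rust
/// Hypothetical validation-side video chunk (field types are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct ValidatedVideo {
    pub data: Vec<u8>,
    pub mimetype: String,
    pub width: u32,
    pub height: u32,
    pub num_frames: u32,
}

/// Hypothetical client-side video message (field types are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct ClientVideo {
    pub data: Vec<u8>,
    pub mimetype: String,
    pub width: u32,
    pub height: u32,
    pub frames: u32,
}

impl From<ValidatedVideo> for ClientVideo {
    /// Moves every field across; only `num_frames` changes name.
    fn from(video: ValidatedVideo) -> Self {
        ClientVideo {
            data: video.data,
            mimetype: video.mimetype,
            width: video.width,
            height: video.height,
            frames: video.num_frames,
        }
    }
}
```

Keeping the conversion in one `From` impl would confine the `num_frames` to `frames` rename to a single place, at the cost of one more trait implementation to maintain.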