This project demonstrates how to use BentoML as a queue consumer, enabling push-based inference instead of the traditional pull-based inference (i.e., HTTP requests). Specifically, this project uses RabbitMQ as the queue and Amazon S3 for artifact storage.
The Bento service listens for messages published to RabbitMQ. When a message is published (containing an image URL to S3), the Bento service downloads the image, processes it using a CLIP model, and saves the output back to S3.
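The consumer loop described above can be sketched as follows. This is a minimal illustration using the `pika` client, not the repository's actual `service.py`; the queue name `images`, the `handle_message` helper, and the message shape are assumptions:

```python
import json


def handle_message(body: bytes) -> list:
    """Extract S3 object keys from a queue message.

    Hypothetical helper: assumes the payload is a JSON list of
    objects of the form {"key": "<s3-object-key>"}.
    """
    return [item["key"] for item in json.loads(body)]


if __name__ == "__main__":
    import pika  # only needed when actually consuming

    def on_message(channel, method, properties, body):
        for key in handle_message(body):
            # In the real service: download the object from S3,
            # run the CLIP model, and upload the result back to S3.
            print(f"processing S3 key: {key}")
        channel.basic_ack(delivery_tag=method.delivery_tag)

    conn = pika.BlockingConnection(
        pika.URLParameters("amqp://guest:guest@localhost:5672/")
    )
    channel = conn.channel()
    channel.queue_declare(queue="images", durable=True)  # assumed queue name
    channel.basic_consume(queue="images", on_message_callback=on_message)
    channel.start_consuming()
```

Separating the parsing from the RabbitMQ wiring keeps the message-handling logic trivially testable without a running broker.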
- You have installed Python 3.8+ and pip. See the Python downloads page to learn more.
- You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read Quickstart first.
- (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the Conda documentation or the Python documentation for details.
- You need Docker installed to set up RabbitMQ. See the Docker documentation for installation instructions.
- You need an AWS account to use Amazon S3. See the AWS documentation for more information.
```bash
docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:management
```
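Before going further, you can verify that the broker is reachable. The following is a small, stdlib-only sketch; the URL and default AMQP port match the `docker run` command above:

```python
import socket
from urllib.parse import urlparse


def rabbitmq_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the broker's host/port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 5672  # default AMQP port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `rabbitmq_reachable("amqp://guest:guest@localhost:5672/")` should return `True` once the container is up.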
```bash
git clone https://github.com/bentoml/BentoQueue.git
cd BentoQueue
pip install -r requirements.txt
```
- Log in to your AWS Management Console.
- Navigate to the S3 service.
- Click on "Create bucket".
- Enter a unique bucket name (e.g., bento-queue).
- Choose the AWS region (e.g., us-west-1).
- Leave the default settings for the rest of the options and click "Create bucket".
Upload a few images to your S3 bucket so the service has files to download for this project.
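If you ever need full object URLs rather than bare keys, S3's virtual-hosted-style addressing can be derived from the bucket, region, and key. A small helper sketch, using the example bucket and region from the steps above:

```python
def s3_object_url(bucket: str, region: str, key: str) -> str:
    """Build the virtual-hosted-style URL for an S3 object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
```

For instance, `s3_object_url("bento-queue", "us-west-1", "images/my_image.png")` yields `https://bento-queue.s3.us-west-1.amazonaws.com/images/my_image.png`.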
```bash
export RABBITMQ_URL="amqp://guest:guest@localhost:5672/"
export S3_ACCESS_KEY=<your-s3-access-key>
export S3_SECRET_KEY=<your-s3-secret-key>
```
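Inside the service, these variables can be read with the standard library. A minimal sketch; the variable names match the exports above, and the `require` helper is hypothetical:

```python
import os


def require(name: str) -> str:
    """Fetch a required environment variable, failing fast if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# RABBITMQ_URL has a sensible local default; the S3 credentials do not,
# so they would be fetched with require(), e.g.:
#   S3_ACCESS_KEY = require("S3_ACCESS_KEY")
RABBITMQ_URL = os.environ.get("RABBITMQ_URL", "amqp://guest:guest@localhost:5672/")
```

Failing fast on missing credentials surfaces configuration mistakes at startup rather than on the first S3 call.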
We have defined a BentoML Service in `service.py`. Run `bentoml serve` in your project directory to start the Service.
```bash
$ bentoml serve .

2024-01-08T09:07:28+0000 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service:CLIPService" can be accessed at http://localhost:3000/metrics.
2024-01-08T09:07:28+0000 [INFO] [cli] Starting production HTTP BentoServer from "service:CLIPService" listening on http://localhost:3000 (Press CTRL+C to quit)
Model clip loaded device: cuda
```

In another terminal, run the producer script to publish a message to the queue:

```bash
python producer.py
```
You will be prompted to input a message for the queue. Enter the following, replacing the image key with one that exists in your S3 bucket:

```json
[{"key": "images/my_image.png"}]
```
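A producer along these lines could publish that payload. This is a sketch using the `pika` client; the queue name `images` and the `build_message` helper are assumptions, and the repository's `producer.py` may differ:

```python
import json


def build_message(keys: list) -> bytes:
    """Serialize S3 object keys into the JSON message format shown above."""
    return json.dumps([{"key": k} for k in keys]).encode("utf-8")


if __name__ == "__main__":
    import pika  # only needed when actually publishing

    conn = pika.BlockingConnection(
        pika.URLParameters("amqp://guest:guest@localhost:5672/")
    )
    channel = conn.channel()
    channel.queue_declare(queue="images", durable=True)  # assumed queue name
    channel.basic_publish(
        exchange="",
        routing_key="images",
        body=build_message(["images/my_image.png"]),
    )
    conn.close()
```

Publishing to the default exchange with the queue name as the routing key delivers the message directly to that queue.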
After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. Sign up if you don't have a BentoCloud account yet.

Make sure you have logged in to BentoCloud, then run the following command to deploy the application:

```bash
bentoml deploy .
```

Once the application is up and running on BentoCloud, you can access it via the exposed URL.
Note: For custom deployment in your own infrastructure, use BentoML to generate an OCI-compliant image.
