Tutorial - NanoOWL
Let's run NanoOWL, OWL-ViT optimized to run in real time on Jetson with NVIDIA TensorRT.
What you need
- One of the following Jetson devices:
  - Jetson AGX Orin (64GB)
  - Jetson AGX Orin (32GB)
  - Jetson Orin Nano (8GB)
- Running one of the following versions of JetPack 5.x:
  - JetPack 5.1.2 (L4T r35.4.1)
  - JetPack 5.1.1 (L4T r35.3.1)
  - JetPack 5.1 (L4T r35.2.1)
- Sufficient storage space (preferably with NVMe SSD):
  - 7.2 GB for container image
  - Space for models
- Clone and set up jetson-containers:
git clone https://github.com/dusty-nv/jetson-containers
cd jetson-containers
sudo apt update; sudo apt install -y python3-pip
pip3 install -r requirements.txt
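If you are unsure which L4T/JetPack release your device runs, or how much storage is free, you can check both from a terminal on the Jetson before proceeding. These are standard Jetson/Linux commands, not specific to NanoOWL:

cat /etc/nv_tegra_release   # prints the L4T release, e.g. "R35 (release), REVISION: 4.1" corresponds to JetPack 5.1.2
df -h .                     # confirm enough free space for the ~7.2 GB container image plus models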
How to start
Use the run.sh and autotag scripts to automatically pull or build a compatible container image.
cd jetson-containers
./run.sh $(./autotag nanoowl)
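Here, autotag resolves a container tag matching your JetPack version and run.sh starts it with the proper device mounts. Once inside the container, a quick sanity check is possible, assuming the image installs nanoowl as a Python package (the prebuilt images should):

python3 -c "import nanoowl, tensorrt; print(tensorrt.__version__)"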
How to run the tree prediction (live camera) example
- Ensure you have a camera device connected:

ls /dev/video*

If no video device is found, exit from the container and check if you can see a video device on the host side.
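For a more descriptive listing than ls provides, v4l2-ctl (from the v4l-utils package, which you may need to install on the host first) shows each camera by name along with its device nodes:

sudo apt install -y v4l-utils
v4l2-ctl --list-devices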
- Launch the demo:

cd examples/tree_demo
python3 tree_demo.py ../../data/owl_image_encoder_patch32.engine
Info

If it fails to find or load the TensorRT engine file, build the TensorRT engine for the OWL-ViT vision encoder on your Jetson device:

python3 -m nanoowl.build_image_encoder_engine \
    data/owl_image_encoder_patch32.engine
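As an aside, if you prefer to script predictions instead of using the web demo, NanoOWL also exposes a Python predictor class. The sketch below follows the usage pattern in the NanoOWL README; treat the exact argument names as version-dependent, and input.jpg is a placeholder for any test image:

import PIL.Image
from nanoowl.owl_predictor import OwlPredictor

# Load OWL-ViT with the TensorRT-optimized image encoder built above
predictor = OwlPredictor(
    "google/owlvit-base-patch32",
    image_encoder_engine="data/owl_image_encoder_patch32.engine",
)

image = PIL.Image.open("input.jpg")  # placeholder: any test image
text = ["a face", "a nose"]

# Text encodings can be computed once and reused across frames
output = predictor.predict(
    image=image,
    text=text,
    text_encodings=predictor.encode_text(text),
    threshold=0.1,
)
print(output)  # detection boxes, labels, and scores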
- Open your browser and access http://<ip address>:7860
- Type whatever prompt you like to see what works! In the tree prompt syntax, square brackets [ ] mark objects to detect and parentheses ( ) mark classes to classify against; nesting a prompt applies it inside each detected region. Here are some examples:
  - Example: [a face [a nose, an eye, a mouth]]
  - Example: [a face (interested, yawning / bored)]
  - Example: (indoors, outdoors)