Tutorial - NanoOWL
What you need
One of the following Jetson devices:

- Jetson AGX Orin (64GB)
- Jetson AGX Orin (32GB)
- Jetson Orin Nano (8GB)
Running one of the following versions of JetPack:

- JetPack 5.1.2 (L4T r35.4.1)
- JetPack 5.1.1 (L4T r35.3.1)
- JetPack 5.1 (L4T r35.2.1)
Sufficient storage space (preferably with NVMe SSD).
- 7.2 GB for container image
- Space for models
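Before pulling the container, it is worth confirming you have enough free space. A quick sketch using `df` (the `/var/lib/docker` path is an assumption based on Docker's default data root; adjust if your container storage lives elsewhere, e.g. on an NVMe mount):

```shell
# Show free space where container images are stored.
# /var/lib/docker is Docker's default data root (assumption); fall back to /.
df -h /var/lib/docker 2>/dev/null || df -h /
```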
Clone and set up
```bash
git clone https://github.com/dusty-nv/jetson-containers
cd jetson-containers
sudo apt update
sudo apt install -y python3-pip
pip3 install -r requirements.txt
```
How to start
Use the `autotag` script to automatically pull or build a compatible container image:
```bash
cd jetson-containers
./run.sh $(./autotag nanoowl)
```
How to run the tree prediction (live camera) example
Ensure you have a camera device connected
If no video device is found, exit from the container and check if you can see a video device on the host side.
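A quick way to check for a camera on either side is to list the V4L2 device nodes. A minimal sketch (tools like `v4l2-ctl` from the `v4l-utils` package give more detail, but a simple listing is enough to confirm the device exists):

```shell
# List video device nodes; print a notice instead of erroring if none exist.
ls /dev/video* 2>/dev/null || echo "no /dev/video* devices found"
```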
Launch the demo
```bash
cd examples/tree_demo
python3 tree_demo.py ../../data/owl_image_encoder_patch32.engine
```
If it fails to find or load the TensorRT engine file, build the TensorRT engine for the OWL-ViT vision encoder on your Jetson device:
```bash
python3 -m nanoowl.build_image_encoder_engine \
    data/owl_image_encoder_patch32.engine
```
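Before relaunching the demo, you can sanity-check that the engine file was actually produced. A minimal sketch (the path mirrors the build command above; `engine_ready` is a hypothetical helper, not part of NanoOWL):

```python
import os

def engine_ready(path="data/owl_image_encoder_patch32.engine"):
    """Return True if the TensorRT engine file exists and is non-empty.

    Hypothetical helper for illustration; the default path matches the
    build command in this tutorial.
    """
    return os.path.isfile(path) and os.path.getsize(path) > 0

if not engine_ready():
    print("Engine missing - rebuild it with nanoowl.build_image_encoder_engine")
```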
Second, open the demo's web interface in your browser.
Type whatever prompt you like to see what works!
Here are some examples:

```
[a face [a nose, an eye, a mouth]]
[a face (interested, yawning / bored)]
```
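In these prompts, square brackets nest detection steps and parentheses nest classification steps, so labels form a tree. As an illustration of the nesting only (this is not NanoOWL's actual parser), a minimal sketch that extracts each label with its depth in the tree:

```python
def parse_tree_prompt(prompt):
    """Split a NanoOWL-style tree prompt into (depth, label) pairs.

    Illustrative sketch only: NanoOWL's real parser also distinguishes
    detection '[...]' nodes from classification '(...)' nodes.
    """
    labels, depth, buf = [], 0, ""
    for ch in prompt:
        if ch in "[(":
            if buf.strip():
                labels.append((depth, buf.strip()))
            buf, depth = "", depth + 1
        elif ch in "])":
            if buf.strip():
                labels.append((depth, buf.strip()))
            buf, depth = "", depth - 1
        elif ch == ",":
            if buf.strip():
                labels.append((depth, buf.strip()))
            buf = ""
        else:
            buf += ch
    return labels

# parse_tree_prompt("[a face [a nose, an eye, a mouth]]")
# → [(1, 'a face'), (2, 'a nose'), (2, 'an eye'), (2, 'a mouth')]
```

Each label's depth shows which outer label it refines: "a nose", "an eye", and "a mouth" are all searched for inside regions detected as "a face".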