Yes, since we have similar GPUs you could try the following to run it in a Docker container on Linux, taken from here and slightly modified:
#!/bin/bash
model=microsoft/phi-2
# share a volume with the Docker container to avoid downloading weights every run
volume=<path-to-your-data-directory>/data
docker run \
  -e HSA_OVERRIDE_GFX_VERSION=10.3.0 \
  -e PYTORCH_ROCM_ARCH="gfx1031" \
  --device /dev/kfd \
  --device /dev/dri \
  --shm-size 1g \
  -p 8080:80 \
  -v "$volume:/data" \
  ghcr.io/huggingface/text-generation-inference:1.4-rocm \
  --model-id "$model"
Note that the ROCm build has a different image tag and that you need to pass your GPU devices (/dev/kfd and /dev/dri) into the container. The two environment variables are specific to my (and maybe also your) GPU architecture: HSA_OVERRIDE_GFX_VERSION=10.3.0 makes ROCm treat the officially unsupported gfx1031 card as gfx1030. The image will take a while to download, though.
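Once the container is up, you can sanity-check it with a quick request against TGI's /generate endpoint. A minimal sketch, assuming the 8080:80 port mapping from the command above and a placeholder prompt:

# send a test prompt to the server running on localhost:8080
curl 127.0.0.1:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 50}}'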
I'm so looking forward to this. When I tried to use tmpfs / a ramdisk, the transcoding would simply stop because there was no space left.