A post by Kailas Goliwadekar
In the recent past, I’ve been working on AI/ML with PowerFlex as SDS.
My objective was to build a cloud-native artificial intelligence platform made up of a Red Hat OpenShift cluster, NVIDIA GPUs, and PowerFlex. I first built a PowerFlex 4.0 platform consisting of 4 SDS. On separate PowerEdge nodes, I built an OpenShift bare-metal cluster with 3 master nodes and 4 worker nodes. The entire OpenShift deployment was carried out with the Assisted Installer.
I then deployed the PowerFlex CSI driver on the OpenShift worker nodes, which enables pods to connect to PowerFlex storage.
The logical architecture of OpenShift on PowerFlex is shown in the figure below.
To run NVIDIA's speech recognition services, I first had to install the GPU Operator from the OpenShift console. The NVIDIA GPU Operator makes the underlying GPUs of a compute node available to containerized workloads.
A prerequisite for running the GPU Operator is the Node Feature Discovery (NFD) Operator, which detects hardware features and system configuration at the node level. After installing the NFD Operator and creating a NodeFeatureDiscovery instance, we can install the NVIDIA GPU Operator and create an instance of ClusterPolicy.
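Once both operators are running, it helps to confirm that the GPUs are actually visible to the cluster. The following is a minimal sketch, not the exact commands I ran: the helper names gpu_nodes, gpu_operator_pods, and gpu_capacity are hypothetical, the nvidia-gpu-operator namespace is assumed to be the one chosen at install time, and the label is the one NFD normally applies to nodes carrying a device with NVIDIA's PCI vendor ID (10de).

```shell
# Hypothetical helpers for checking the GPU stack after the operators are installed.

# Nodes that NFD has labelled as carrying an NVIDIA PCI device (vendor ID 10de).
gpu_nodes() {
  oc get nodes -l feature.node.kubernetes.io/pci-10de.present=true
}

# Pods created by the GPU Operator; the namespace is an assumption and may
# differ depending on how the operator was installed.
gpu_operator_pods() {
  oc get pods -n nvidia-gpu-operator
}

# GPU capacity and allocatable counts as reported in the node descriptions.
gpu_capacity() {
  oc describe nodes | grep 'nvidia.com/gpu'
}

# Usage:
#   gpu_nodes
#   gpu_operator_pods
#   gpu_capacity
```

If gpu_capacity prints nothing, the ClusterPolicy has most likely not finished rolling out the driver and device plugin yet.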
To deploy the Riva API, I performed the following steps on my OpenShift cluster.
In the riva-api folder, I set asr, nlp, and tts to true or false as needed. I also changed service.type from LoadBalancer to ClusterIP, which exposes the service only to other services within the cluster.
Enable the cluster to run containers that need NVIDIA GPUs by using the NVIDIA device plugin.
The Helm chart runs two containers in order: a riva-model-init container that downloads and deploys the models, followed by a riva-speech-api container to start the speech service API. Depending on the number of models, the initial model deployment could take an hour or more. To monitor the deployment, use kubectl to describe the riva-api pod and to watch the container logs.
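The monitoring step can be sketched as a small shell helper. This is an illustration under assumptions: monitor_riva is a hypothetical function name, and the pod name has to be taken from the output of kubectl get pods. The container names are the two mentioned above.

```shell
# Hypothetical helper: pass the riva-api pod name as the first argument.
monitor_riva() {
  pod="$1"

  # Show scheduling events, image pulls, and the state of each container.
  kubectl describe pod "$pod"

  # Follow the model download and deployment done by the init container.
  kubectl logs -f "$pod" -c riva-model-init

  # After the init container has finished, follow the speech service itself.
  kubectl logs -f "$pod" -c riva-speech-api
}

# Usage (look up the pod name with `kubectl get pods` first):
#   monitor_riva <riva-api-pod-name>
```

Because kubectl logs -f blocks until the container exits, the helper moves on to the riva-speech-api logs only after riva-model-init has completed.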
Modify the traefik/values.yaml file: change service.type from LoadBalancer to ClusterIP, which exposes the service on a cluster-internal IP. Now deploy the modified traefik Helm chart.
helm install traefik traefik/
An IngressRoute enables the Traefik load balancer to recognize incoming requests and distribute them across multiple riva-api services. When you deployed the traefik Helm chart above, Kubernetes automatically created a local DNS entry for that service: traefik.default.svc.cluster.local. The IngressRoute definition below matches these DNS entries and directs requests to the riva-api service. Create the following riva-ingress.yaml file:
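A minimal sketch of what this file could contain, written out from the shell with a heredoc. Several details are assumptions rather than a copy of my file: the traefik.containo.us/v1alpha1 API group (used by Traefik v2 charts; newer releases use traefik.io/v1alpha1), the route name riva-ingressroute, the web entry point, and port 50051, which is Riva's default gRPC port. The h2c scheme tells Traefik to forward plain-text HTTP/2, which gRPC needs.

```shell
# Write a sketch of the IngressRoute to riva-ingress.yaml.
# The quoted 'EOF' keeps the backticks in the Host() rule literal.
cat > riva-ingress.yaml <<'EOF'
apiVersion: traefik.containo.us/v1alpha1   # assumption: Traefik v2 CRD group
kind: IngressRoute
metadata:
  name: riva-ingressroute                  # hypothetical name
spec:
  entryPoints:
    - web                                  # assumption: default HTTP entry point
  routes:
    - match: Host(`traefik.default.svc.cluster.local`)
      kind: Rule
      services:
        - name: riva-api
          port: 50051                      # assumption: Riva's default gRPC port
          scheme: h2c
EOF
```

Applying it with kubectl apply -f riva-ingress.yaml registers the route with Traefik.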
I0512 13:00:14.664886 47228 riva_streaming_asr_client.cc:150] Using Insecure Server Credentials
Loading eval dataset…
filename: /opt/riva/wav/en-US_sample.wav
Done loading 1 files
what
what
what is
what is
what is
what is now
what is natural
what is natural
what is natural language
what is natural language
what is natural language
what is natural language
what is natural language Processing
what is natural language Processing
what is natural language Processing
what is natural language Processing
what is natural language Processing
what is language Processing
what is language Processing
What is Natural Language Processing?
-----------------------------------------------------------
File: /opt/riva/wav/en-US_sample.wav
Final transcripts:
0 : What is Natural Language Processing?
Timestamps:
Word          Start (ms)   End (ms)
What          840          880
is            1160         1200
Natural       1800         2080
Language      2200         2520
Processing?   2720         3200
Audio processed: 4 sec.
-----------------------------------------------------------
Not printing latency statistics because the client is run without the --simulate_realtime option and/or the number of requests sent is not equal to number of requests received. To get latency statistics, run with --simulate_realtime and set the --chunk_duration_ms to be the same as the server chunk duration
Run time: 0.1486 sec.
Total audio processed: 4.152 sec.
Throughput: 27.9407 RTFX
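The throughput figure can be reproduced from the two lines before it. RTFX is the number of seconds of audio processed per second of wall-clock time, so it is the total audio duration divided by the run time. A quick check with awk, using the rounded values printed in the summary (the variable names are only for illustration):

```shell
# Values copied from the client summary.
audio_sec=4.152
run_sec=0.1486

# RTFX = seconds of audio processed per second of run time.
rtfx=$(awk -v a="$audio_sec" -v r="$run_sec" 'BEGIN { printf "%.2f", a / r }')
echo "Throughput: ${rtfx} RTFX"
```

This gives about 27.94, which agrees with the reported 27.9407 up to the rounding of the printed run time.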
I have published a short video of all the other demos for Riva services. Check it out!