Enter Model Details
Model Name
Provide a descriptive name for your model to identify it within your workspace.
Source
Specify where the model will be fetched from. Choose from the following options:
Public Sources
- HuggingFace Model Hub
  Provide the repository path in the format creator/model-slug
  Example: meta-llama/Llama-3.2-3B-Instruct
- Public URL
  Provide a direct, publicly accessible download link to the model
Getting your model path from HuggingFace
- Visit huggingface.co.
- Use the search bar to find the desired model. (e.g., “whisper-large”)
- Click on the model you want from the search results. (e.g., openai/whisper-large-v3-turbo)
- Copy the model path displayed at the top of the page (e.g., openai/whisper-large-v3-turbo) and use it as the model source.
The model path on HuggingFace follows the format: creator/model-slug.
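If you want to double-check a repository path before entering it, the snippet below is a minimal sketch using the huggingface_hub Python package (not part of the form itself; it only assumes the package is installed):

```python
# Minimal sketch: verify a HuggingFace repository path before using it as a model source.
# Requires: pip install huggingface_hub
from huggingface_hub import HfApi

repo_id = "openai/whisper-large-v3-turbo"  # format: creator/model-slug

api = HfApi()
info = api.model_info(repo_id)  # raises RepositoryNotFoundError if the path is wrong
print(info.id)                  # the exact creator/model-slug to paste into the form
```

Note that gated repositories (such as the meta-llama models) also require an access token to be queried or downloaded.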
Cloud Storage
Cloud storage sources require authentication credentials (configured as Secrets in your workspace).
- AWS S3
  Enter the S3 bucket path (e.g., s3://my-bucket/models/my-model)
- GCP GCS
  Enter the Google Cloud Storage bucket path (e.g., gs://my-bucket/models/my-model)
- Shakti Cloud S3
  Provide the Shakti Cloud S3 path where your model is stored (e.g., s3://my-bucket/models/my-model)
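Before pointing the form at a cloud bucket, you can confirm that the model files are actually present under the path. The following is a minimal sketch for the AWS S3 option, assuming boto3 is installed and your local credentials match the Secret configured in your workspace; the bucket and prefix names are placeholders:

```python
# Minimal sketch: list the objects under an S3 prefix to confirm the model files exist.
# Requires: pip install boto3
import boto3

bucket, prefix = "my-bucket", "models/my-model"  # from s3://my-bucket/models/my-model

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])  # expect weights, tokenizer, and config files here
```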
Model Class
Select the appropriate model class based on your model architecture (e.g., LlamaForCausalLM for Llama-series models).
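If you are unsure which class applies, the architecture is usually recorded in the model's config.json on HuggingFace. The snippet below is a minimal sketch using the transformers package (an assumption; reading config.json directly works just as well):

```python
# Minimal sketch: read the architecture name from a model's config to pick the model class.
# Requires: pip install transformers
from transformers import AutoConfig

# Gated repositories (e.g., meta-llama models) may require an access token via token=...
config = AutoConfig.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
print(config.architectures)  # e.g., ['LlamaForCausalLM'] -- use this as the model class
```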

Optimizing Infrastructure
- Configure the infrastructure to optimize the model’s performance, such as selecting the appropriate compute resources and optimization techniques.
Configuration
- Select the desired quantization format, FP16 or AWQ, based on your performance and resource requirements (a rough memory comparison is sketched after this list).
  - FP16 (Half-Precision): Offers higher precision and accuracy, but requires more GPU memory and compute power.
  - AWQ (Activation-aware Weight Quantization): Reduces model size and memory usage with minimal impact on accuracy, making it suitable for resource-constrained environments.
- The optimization, model, and pipeline configurations are auto-filled based on the details provided earlier. You may modify them if required to suit your deployment needs.
- Finalize the model’s configuration by setting any additional parameters or preferences required for deployment.
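As a rough guide when choosing between the two formats, weight memory scales with bytes per parameter: about 2 bytes per parameter for FP16 and roughly 0.5 bytes per parameter for 4-bit AWQ. The back-of-the-envelope sketch below illustrates this; the parameter count is an assumption for illustration, and it excludes KV cache and activation memory:

```python
# Back-of-the-envelope sketch: weight memory for FP16 vs. 4-bit AWQ (weights only).
def weight_memory_gib(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / (1024 ** 3)

params_b = 3.2  # roughly the size of Llama-3.2-3B-Instruct
print(f"FP16: ~{weight_memory_gib(params_b, 2.0):.1f} GiB")  # 2 bytes per parameter
print(f"AWQ : ~{weight_memory_gib(params_b, 0.5):.1f} GiB")  # ~0.5 bytes per parameter (4-bit)
```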


