Frequently Asked Questions¶
General¶
Q: What makes FRAMEWORM different?
A: FRAMEWORM is all-in-one. Training, tracking, search, deployment - everything integrated.
Q: Is FRAMEWORM production-ready?
A: Yes! Used in production at several companies.
Q: Does it support distributed training?
A: Yes, both DataParallel and DistributedDataParallel.
Installation¶
Q: Which Python versions are supported?
A: Python 3.8+ (3.10 recommended)
Q: Does it work on Windows?
A: Yes, but Linux is recommended for production.
Q: Can I use it without GPUs?
A: Yes, CPU training works fine (just slower).
Training¶
Q: How do I resume training?
A: Use --resume checkpoint.pt with CLI or:
checkpoint = torch.load('checkpoint.pt')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
Q: Why is training slow?
A: Common causes: - CPU instead of GPU - Small batch size - Data loading bottleneck - No mixed precision
Deployment¶
Q: Which export format should I use?
A: - TorchScript - PyTorch native, C++ deployable - ONNX - Framework agnostic, TensorRT support - Quantized - 4x smaller, 2-3x faster
Q: How do I deploy to production?
A: See Production Deployment Guide
Troubleshooting¶
Q: CUDA out of memory
A: Reduce batch size or enable gradient accumulation.
Q: Model not converging
A: Try: - Lower learning rate - Different optimizer - Hyperparameter search
Q: Import errors
A: Reinstall: pip install --force-reinstall frameworm