Make whisper an optional extra with faster-whisper by default #1877
Dreamsorcerer wants to merge 6 commits into dev
Conversation
STT seems to work pretty well with faster-whisper anyway.
Greptile Summary: This PR makes whisper an optional extra, with faster-whisper as the default backend.
Confidence Score: 4/5. Safe to merge after fixing the misleading UserWarning that fires for all default users.

One P1 finding: the UserWarning implies a degraded fallback state for every default install, which will confuse users. The missing version constraint on faster-whisper (P2) could also cause a runtime TypeError with older versions. Both are straightforward one-line fixes.

Important files changed: dimos/stream/audio/stt/node_whisper.py (misleading warning), pyproject.toml (missing version constraint).
Flowchart:

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[import WhisperNode] --> B{try: import whisper}
    B -- success --> C[_USE_FASTER_WHISPER = False\nopenai-whisper backend]
    B -- ImportError --> D{try: from faster_whisper\nimport WhisperModel}
    D -- success --> E[UserWarning fired\n_USE_FASTER_WHISPER = True\nfaster-whisper backend]
    D -- ImportError --> F[Raise ImportError\nNo backend found]
    C --> G[WhisperNode.__init__]
    E --> G
    G --> H{_USE_FASTER_WHISPER?}
    H -- True --> I[pop fp16 → compute_type\nWhisperModel device=auto]
    H -- False --> J[whisper.load_model]
    I --> K[transcribe → segments iterator\njoin seg.text]
    J --> L[transcribe → dict\nresult text]
```
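The selection logic in the flowchart boils down to a try/except import cascade. Below is a standalone sketch with injectable importers, so the pattern can be exercised without either package installed; the function name and parameters are illustrative, not from the PR (in node_whisper.py the imports happen at module level):

```python
import warnings

def select_backend(import_openai, import_faster):
    """Pick a whisper backend, preferring openai-whisper, else faster-whisper.

    The importer callables are parameters purely so this sketch is testable;
    the real module imports `whisper` and `faster_whisper` directly.
    """
    try:
        return "openai", import_openai()
    except ImportError:
        pass
    try:
        module = import_faster()
        warnings.warn(
            "openai-whisper not installed, falling back to faster-whisper.",
            stacklevel=2,
        )
        return "faster", module
    except ImportError:
        raise ImportError("No whisper backend found; install dimos[whisper].")
```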
Reviews (1). Last reviewed commit: "Add warning"
```python
warnings.warn(
    "openai-whisper not installed, falling back to faster-whisper. "
    "Install openai-whisper for the full backend: pip install dimos[whisper]",
    stacklevel=2,
)
_USE_FASTER_WHISPER = True
```
Warning fires for all default users
faster-whisper is now bundled in the agents extra, meaning it is the intended default backend. Every user who installs the standard package will have only faster-whisper available, so this UserWarning fires for 100% of default users on every first import. The message "falling back to faster-whisper" implies a degraded or misconfigured setup, which will confuse users who have done nothing wrong.
Consider removing the warning entirely (faster-whisper is the intended default, not a fallback), or downgrade it to logging.debug, or reword it so it only surfaces if the user somehow expected openai-whisper:
```diff
 warnings.warn(
-    "openai-whisper not installed, falling back to faster-whisper. "
-    "Install openai-whisper for the full backend: pip install dimos[whisper]",
+    "openai-whisper is not installed; using faster-whisper. "
+    "Install openai-whisper via `pip install dimos[whisper]` to use it instead.",
     stacklevel=2,
 )
 _USE_FASTER_WHISPER = True
```
Or simply omit the warning since this is the documented default path.
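The logging.debug option mentioned above could look roughly like this (a sketch; the logger name and helper function are illustrative assumptions, not part of the PR):

```python
import logging

# Logger name is an assumption, not taken from the PR.
logger = logging.getLogger("dimos.stream.audio.stt")

def chosen_backend(openai_available: bool) -> str:
    """Report the backend in use; debug-level only, so default users see nothing."""
    if openai_available:
        return "openai-whisper"
    # Default install path: a debugging breadcrumb, not a user-facing warning.
    logger.debug("openai-whisper not installed; using faster-whisper (the default).")
    return "faster-whisper"
```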
```python
if _USE_FASTER_WHISPER:
    fp16 = modelopts.pop("fp16", False)
    compute_type = "float16" if fp16 else "int8"
    self.modelopts = modelopts
    self.model = WhisperModel(model, device="auto", compute_type=compute_type)
```
Caller's dict mutated by .pop()

modelopts.pop("fp16", False) mutates the dict that was passed in. If the caller stored a reference to their options dict and inspects it after construction, the fp16 key will be silently gone. A shallow copy before popping avoids this:
```diff
 if _USE_FASTER_WHISPER:
-    fp16 = modelopts.pop("fp16", False)
+    opts = dict(modelopts)
+    fp16 = opts.pop("fp16", False)
     compute_type = "float16" if fp16 else "int8"
-    self.modelopts = modelopts
+    self.modelopts = opts
     self.model = WhisperModel(model, device="auto", compute_type=compute_type)
```
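A minimal demonstration of why the copy matters (standalone sketch; the helper functions are illustrative, not the PR's actual code):

```python
def compute_type_mutating(modelopts: dict) -> str:
    # Current behavior: pops directly from the caller's dict.
    fp16 = modelopts.pop("fp16", False)
    return "float16" if fp16 else "int8"

def compute_type_safe(modelopts: dict) -> str:
    # Suggested behavior: shallow-copy first; the caller's dict is untouched.
    opts = dict(modelopts)
    fp16 = opts.pop("fp16", False)
    return "float16" if fp16 else "int8"

caller_opts = {"fp16": True}
compute_type_mutating(caller_opts)   # caller_opts is now {} -- key silently gone
caller_opts = {"fp16": True}
compute_type_safe(caller_opts)       # caller_opts still contains "fp16"
```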
| "openai", | ||
| "openai-whisper", | ||
| "sounddevice", | ||
| "faster-whisper", |
No version constraint on faster-whisper

faster-whisper is listed without a minimum version. The device="auto" argument passed to WhisperModel (line 71 of node_whisper.py) was introduced in faster-whisper>=1.0.0. Without a lower bound, users on older versions will get a runtime TypeError. Adding a floor constraint prevents this:
| "faster-whisper", | |
| "faster-whisper>=1.0.0", |
```toml
]

whisper = [
    "dimos[agents]",
```
Why include dimos[agents] here?
Does it make sense to install without agents? I was just assuming that we'd want all the other dependencies here.
Problem
Whisper requires downloading a 150MB model and depends on torch (with GBs of CUDA downloads).
Solution

Provide faster-whisper by default (a 2MB install) and use it as the backend whenever openai-whisper is not available.

This avoids the 150MB download and means we are one step closer to not depending on torch for a base install.
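The two backends also differ in what transcribe() returns: openai-whisper produces a dict with a "text" key, while faster-whisper returns a (segments, info) pair whose segment texts must be joined, as the flowchart above notes. A sketch of normalizing both shapes (the helper is illustrative, not the PR's actual code):

```python
def extract_text(result, use_faster_whisper: bool) -> str:
    """Normalize a transcribe() result to plain text for either backend."""
    if use_faster_whisper:
        # faster-whisper: (segments, info); each segment carries a .text attribute.
        segments, _info = result
        return "".join(seg.text for seg in segments).strip()
    # openai-whisper: a dict holding the full transcript under "text".
    return result["text"].strip()
```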
Breaking Changes

Users now need to install dimos[whisper] for the full whisper feature.

Test