# YX ASR - Flutter Speech-to-Text Plugin

A Flutter speech recognition plugin built on sherpa_onnx that provides fully offline, real-time speech-to-text.

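## Features

- 🎤 **Real-time speech recognition**: Transcribes speech as you speak
- 🔄 **Toggle recording**: Simple start/stop recording with visual feedback
- 🌍 **Multi-language support**: Supports Chinese, English, and other languages
- 📱 **Cross-platform**: Supports iOS and Android
- 🎛️ **Customizable UI**: Flexible recording button widget with a customizable appearance
- 🔒 **Permission management**: Handles microphone permission requests automatically
- ⚡ **Fully offline**: Built on sherpa_onnx; no network connection required
- 🎯 **High-accuracy recognition**: Uses advanced neural network models
- 🚀 **Low latency**: Real-time processing with fast response
- 🔐 **Privacy protection**: Voice data is never uploaded to the cloud
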
## Installation

Add the dependency to your `pubspec.yaml` file:

```yaml
dependencies:
  yx_asr: ^1.0.0
```

Then run:

```bash
flutter pub get
```

## Model File Preparation

Because the plugin uses sherpa_onnx, you need to download the corresponding model files:

1. **Chinese model** (recommended)
   - Download URL: https://github.com/k2-fsa/sherpa-onnx/releases/
   - Model name: `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`
   - Extract to: `assets/models/zh-cn/`

2. **English model**
   - Model name: `sherpa-onnx-streaming-zipformer-en-2023-02-21`
   - Extract to: `assets/models/en-us/`

3. **Model file structure**
   ```
   assets/models/
   ├── zh-cn/
   │   ├── encoder.onnx
   │   ├── decoder.onnx
   │   ├── joiner.onnx
   │   └── tokens.txt
   └── en-us/
       ├── encoder.onnx
       ├── decoder.onnx
       ├── joiner.onnx
       └── tokens.txt
   ```

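
Before initializing, it can help to confirm that a model directory contains the four files listed in the structure above. The following is a minimal sketch and not part of the plugin: `requiredModelFiles` and `missingModelFiles` are hypothetical names, and the file list simply mirrors the layout in this section.

```dart
/// File names expected in each model directory, taken from the layout above.
const List<String> requiredModelFiles = [
  'encoder.onnx',
  'decoder.onnx',
  'joiner.onnx',
  'tokens.txt',
];

/// Returns the required files that are absent from [presentFiles].
///
/// Hypothetical helper for illustration; it is not part of yx_asr.
List<String> missingModelFiles(Iterable<String> presentFiles) {
  final present = presentFiles.toSet();
  return requiredModelFiles.where((name) => !present.contains(name)).toList();
}
```

For example, `missingModelFiles(['encoder.onnx', 'tokens.txt'])` returns `['decoder.onnx', 'joiner.onnx']`, and an empty list means the directory is complete. How the list of present files is obtained (asset manifest, file system) depends on how your app ships the models.
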
## Platform Configuration

### Android

Add the permission to `android/app/src/main/AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

### iOS

Add the permission to `ios/Runner/Info.plist`:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your speech for recognition</string>
```

Note: Because sherpa_onnx performs recognition offline, no network permission or speech recognition permission is required.

## Quick Start

### Basic Usage

```dart
import 'dart:async';

import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

// Place this widget under a MaterialApp so that Scaffold has the ancestors it needs.
class MyApp extends StatefulWidget {
  const MyApp({super.key});

  @override
  State<MyApp> createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  final List<StreamSubscription<dynamic>> _subscriptions = [];
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  @override
  void dispose() {
    // Cancel stream subscriptions so callbacks never run after disposal.
    for (final subscription in _subscriptions) {
      subscription.cancel();
    }
    super.dispose();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model.
    final bool initialized =
        await _speechToText.initializeWithModel('assets/models/zh-cn');
    if (!initialized || !mounted) return;

    // Listen for recognition results.
    _subscriptions.add(_speechToText.onResult.listen((result) {
      setState(() {
        _recognizedText = result.recognizedWords;
      });
    }));

    // Listen for errors.
    _subscriptions.add(_speechToText.onError.listen((error) {
      debugPrint('Speech recognition error: ${error.errorMsg}');
    }));

    // Listen for listening-status changes.
    _subscriptions.add(
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      }),
    );
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // Enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Recognized text: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}
```

### Using the Recording Button Widget

The plugin includes a customizable `RecordingButton` widget:

```dart
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)
```

## API Reference

### YxAsr Class

#### Methods

- `Future<bool> initialize()` - Initialize the speech recognition service
- `Future<bool> isAvailable()` - Check if speech recognition is available
- `Future<bool> hasPermission()` - Check if microphone permission is granted
- `Future<bool> requestPermission()` - Request microphone permission
- `Future<void> startListening({String localeId, bool partialResults, bool onDevice})` - Start listening
- `Future<void> stopListening()` - Stop listening and get final result
- `Future<void> cancel()` - Cancel current recognition session
- `Future<bool> get isListening` - Check if currently listening

#### Streams

- `Stream<SpeechRecognitionResult> onResult` - Stream of recognition results
- `Stream<SpeechRecognitionError> onError` - Stream of recognition errors
- `Stream<bool> onListeningStatusChanged` - Stream of listening status changes

### SpeechRecognitionResult

```dart
class SpeechRecognitionResult {
  final String recognizedWords;      // The recognized text
  final bool finalResult;            // Whether this is a final result
  final double confidence;           // Confidence level (0.0 to 1.0)
  final List<String> alternatives;   // Alternative recognition results
}
```

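
With `partialResults` enabled, a common pattern is to keep finalized text separate from the in-progress partial. The sketch below assumes, without confirmation from the plugin, that each partial result replaces the previous one and that a result with `finalResult == true` closes the current utterance. `TranscriptBuffer` is a hypothetical helper that takes plain values, so it does not depend on the plugin's types.

```dart
/// Accumulates finalized utterances and tracks the current partial text.
///
/// Hypothetical helper for illustration; it is not part of yx_asr.
class TranscriptBuffer {
  TranscriptBuffer({this.separator = ' '});

  /// Text placed between utterances; an empty string may suit Chinese.
  final String separator;

  final List<String> _finalized = [];
  String _partial = '';

  /// Feeds one recognition result into the buffer.
  void add(String recognizedWords, {required bool finalResult}) {
    if (finalResult) {
      // Commit the utterance and clear the in-progress text.
      final trimmed = recognizedWords.trim();
      if (trimmed.isNotEmpty) {
        _finalized.add(trimmed);
      }
      _partial = '';
    } else {
      // Assumption: a newer partial supersedes the previous one.
      _partial = recognizedWords;
    }
  }

  /// Finalized utterances followed by the current partial, if any.
  String get text =>
      [..._finalized, if (_partial.isNotEmpty) _partial].join(separator);
}
```

Inside an `onResult` listener this could be called as `buffer.add(result.recognizedWords, finalResult: result.finalResult)`, with `buffer.text` shown in the UI.
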
### SpeechRecognitionError

```dart
class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType;  // Type of error
  final String errorMsg;                       // Human-readable error message
  final String? errorCode;                     // Platform-specific error code
}
```

### RecordingButton Widget

#### Properties

- `onResult` - Callback for recognition results
- `onError` - Callback for recognition errors
- `onListeningStatusChanged` - Callback for status changes
- `localeId` - Language locale (default: 'en-US')
- `partialResults` - Enable partial results (default: true)
- `onDevice` - Use on-device recognition on iOS (default: false)
- `size` - Button size (default: 80.0)
- `idleColor` - Button color when not recording
- `recordingColor` - Button color when recording
- `disabledColor` - Button color when disabled
- `enabled` - Whether the button is enabled (default: true)
- `tooltip` - Tooltip text

## Supported Languages

The plugin supports multiple languages, including:

- English (en-US, en-GB)
- Chinese (zh-CN, zh-TW)
- Japanese (ja-JP)
- Korean (ko-KR)
- Spanish (es-ES)
- French (fr-FR)
- German (de-DE)
- Italian (it-IT)

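
Since recognition runs on locally bundled models, an app typically needs some way to go from a locale ID to a model directory. The mapping below is a sketch under that assumption: `modelDirectories` and `modelDirForLocale` are hypothetical names, and only the two directories described in the model setup section of this README are listed.

```dart
/// Lower-cased locale ID mapped to a bundled model directory.
///
/// Assumption: only the two directories from the model setup section exist.
const Map<String, String> modelDirectories = {
  'zh-cn': 'assets/models/zh-cn',
  'en-us': 'assets/models/en-us',
};

/// Returns the model directory for [localeId], or null if none is bundled.
///
/// Accepts both `zh-CN` and `zh_CN` spellings. Hypothetical helper for
/// illustration; it is not part of yx_asr.
String? modelDirForLocale(String localeId) {
  final key = localeId.trim().toLowerCase().replaceAll('_', '-');
  return modelDirectories[key];
}
```

A `null` result (for example for `ja-JP` with this table) is a signal to fall back to another bundled model or to report that the language is unavailable before calling `initializeWithModel`.
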
## Error Handling

Errors are delivered on the `onError` stream as `SpeechRecognitionError` objects. Switch on `errorType` to handle each case:

```dart
_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    default:
      // Handle other error types
      break;
  }
});
```

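
To surface these errors to users, one option is a small lookup from error type to display text. The sketch below covers only the three type names that appear in this README and falls back to a generic message for anything else. `userMessageFor` is a hypothetical helper, and the message wording is illustrative. It takes the type name as a string so that it stays independent of the plugin.

```dart
/// Maps an error type name to a message suitable for display.
///
/// Only the three type names that appear in this README are handled; any
/// other name falls back to a generic message. Hypothetical helper for
/// illustration; it is not part of yx_asr.
String userMessageFor(String errorTypeName) {
  switch (errorTypeName) {
    case 'permissionDenied':
      return 'Microphone access is required. Please enable it in Settings.';
    case 'network':
      return 'A network problem occurred. Please try again.';
    case 'noSpeech':
      return 'No speech was detected. Try speaking closer to the microphone.';
    default:
      return 'Speech recognition failed. Please try again.';
  }
}
```

If `SpeechRecognitionErrorType` is a Dart enum (an assumption; the README does not show its declaration), the call could look like `userMessageFor(error.errorType.name)`.
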
## Best Practices

1. **Always check permissions** before starting recognition
2. **Handle errors gracefully** to provide a good user experience
3. **Use partial results** for real-time feedback
4. **Stop listening** when done to conserve battery
5. **Test on real devices**, as speech recognition doesn't work well on simulators

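
For the "stop listening" recommendation, one possible approach is an idle timer that fires when no result has arrived for a while. The sketch below is generic Dart rather than plugin API: `IdleStopper` is a hypothetical helper, and the timeout value is left to the caller.

```dart
import 'dart:async';

/// Calls [onIdle] once no activity has been reported for [timeout].
///
/// Hypothetical helper for illustration; it is not part of yx_asr.
class IdleStopper {
  IdleStopper({required this.timeout, required this.onIdle});

  final Duration timeout;
  final void Function() onIdle;
  Timer? _timer;

  /// Whether a countdown is currently running.
  bool get isActive => _timer?.isActive ?? false;

  /// Restarts the countdown; call when listening starts and on each result.
  void touch() {
    _timer?.cancel();
    _timer = Timer(timeout, onIdle);
  }

  /// Stops the countdown without invoking [onIdle].
  void cancel() {
    _timer?.cancel();
    _timer = null;
  }
}
```

A caller might create `IdleStopper(timeout: const Duration(seconds: 10), onIdle: () => asr.stopListening())`, where `asr` is a `YxAsr` instance and the ten-second value is only an example. It would then call `touch()` from the `onResult` listener and `cancel()` when the user stops recording manually.
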
## Example App

Check out the `example/` directory for a comprehensive example app that demonstrates:

- Real-time speech recognition
- Multiple language support
- Error handling
- Recognition history
- Customizable settings

## Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Troubleshooting

### Common Issues

1. **Permission Denied Error**
   - Ensure the microphone permission is declared in `AndroidManifest.xml` and `Info.plist`
   - Call `requestPermission()` before starting recognition

2. **Speech Recognition Not Available**
   - Check whether the device supports speech recognition with `isAvailable()`
   - Ensure the Google app is installed and updated on Android

3. **No Speech Detected**
   - Check the microphone hardware
   - Ensure the app has microphone permission
   - Try speaking louder or closer to the microphone

4. **Network Errors**
   - Check internet connectivity
   - Some platforms require a network connection for speech recognition

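
The permission check from the first item can be wrapped in a small helper that runs before recognition starts. This is a sketch, not plugin code: `ensureMicrophonePermission` is a hypothetical name, and the two callbacks are meant to be the `hasPermission` and `requestPermission` methods listed in the API reference. Passing them in as functions keeps the example independent of the plugin.

```dart
/// Returns true when the microphone may be used, requesting access if needed.
///
/// The callbacks are expected to be `hasPermission` and `requestPermission`
/// from the API reference. Hypothetical helper for illustration; it is not
/// part of yx_asr.
Future<bool> ensureMicrophonePermission({
  required Future<bool> Function() hasPermission,
  required Future<bool> Function() requestPermission,
}) async {
  // Skip the system prompt when access has already been granted.
  if (await hasPermission()) return true;
  return requestPermission();
}
```

With a `YxAsr` instance named `asr`, the call could be `await ensureMicrophonePermission(hasPermission: asr.hasPermission, requestPermission: asr.requestPermission)`, with `startListening` invoked only when the result is `true`.
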
### Testing

- Speech recognition doesn't work well on simulators/emulators
- Always test on real devices
- Test in quiet environments for better accuracy

## Support

For issues and feature requests, please use the GitHub issue tracker.