# YX ASR - Flutter Speech-to-Text Plugin
A Flutter speech-recognition plugin built on sherpa_onnx, providing fully offline, real-time speech-to-text.
## Features
- 🎤 **Real-time recognition**: transcribe speech as you speak
- 🔄 **Toggle recording**: simple start/stop recording with visual feedback
- 🌍 **Multi-language support**: Chinese, English, and more
- 📱 **Cross-platform**: supports iOS and Android
- 🎛️ **Customizable UI**: flexible recording-button widget with configurable appearance
- 🔒 **Permission handling**: microphone permission requests are handled automatically
- 📴 **Fully offline**: based on sherpa_onnx, no network connection required
- 🎯 **High accuracy**: powered by modern neural-network models
- 🚀 **Low latency**: real-time processing with fast response
- 🔐 **Privacy-friendly**: audio data never leaves the device
## Installation
Add the dependency to your `pubspec.yaml`:
```yaml
dependencies:
  yx_asr: ^1.0.0
```
Then run:
```bash
flutter pub get
```
## Preparing Model Files
Because the plugin uses sherpa_onnx, you need to download the corresponding model files:
1. **Chinese model** (recommended)
   - Download from: https://github.com/k2-fsa/sherpa-onnx/releases/
   - Model name: `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`
   - Extract to: `assets/models/zh-cn/`
2. **English model**
   - Model name: `sherpa-onnx-streaming-zipformer-en-2023-02-21`
   - Extract to: `assets/models/en-us/`
3. **Model file layout**
```
assets/models/
├── zh-cn/
│   ├── encoder.onnx
│   ├── decoder.onnx
│   ├── joiner.onnx
│   └── tokens.txt
└── en-us/
    ├── encoder.onnx
    ├── decoder.onnx
    ├── joiner.onnx
    └── tokens.txt
```
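If the models are shipped as Flutter assets (as the paths above suggest), remember to declare the directories in your app's `pubspec.yaml` so Flutter bundles them with the app. A minimal sketch:

```yaml
flutter:
  assets:
    - assets/models/zh-cn/
    - assets/models/en-us/
```

Note that directory entries in `assets` include the files directly inside that directory, so each model folder must be listed.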
## Platform Setup
### Android
Add the microphone permission to `android/app/src/main/AndroidManifest.xml`:
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```
### iOS
Add a usage description to `ios/Runner/Info.plist`:
```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your speech for recognition</string>
```
Note: because sherpa_onnx performs recognition entirely offline, no network or speech-recognition permissions are required.
## Quick Start
### Basic Usage
```dart
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model
    bool initialized =
        await _speechToText.initializeWithModel('assets/models/zh-cn');
    if (initialized) {
      // Listen for recognition results
      _speechToText.onResult.listen((result) {
        setState(() {
          _recognizedText = result.recognizedWords;
        });
      });
      // Listen for errors
      _speechToText.onError.listen((error) {
        print('Speech recognition error: ${error.errorMsg}');
      });
      // Listen for listening-status changes
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      });
    }
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Result: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}
```
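The basic example above never cancels its stream subscriptions. In a real widget you would typically keep the `StreamSubscription` handles returned by `listen()` and cancel them in `dispose()`. A minimal sketch, assuming the same `YxAsr` streams:

```dart
import 'dart:async';
import 'package:yx_asr/yx_asr.dart';

class SpeechController {
  final YxAsr _asr = YxAsr();
  final List<StreamSubscription> _subscriptions = [];

  Future<void> init() async {
    if (await _asr.initializeWithModel('assets/models/zh-cn')) {
      // Keep the subscription handles so they can be cancelled later
      _subscriptions.add(_asr.onResult.listen((r) => print(r.recognizedWords)));
      _subscriptions.add(_asr.onError.listen((e) => print(e.errorMsg)));
    }
  }

  // Call this from your State's dispose() to avoid leaking subscriptions
  Future<void> dispose() async {
    for (final sub in _subscriptions) {
      await sub.cancel();
    }
    await _asr.cancel();
  }
}
```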
### Using the Recording Button Widget
The plugin includes a customizable `RecordingButton` widget:
```dart
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)
```
## API Reference
### YxAsr Class
#### Methods
- `Future<bool> initialize()` - Initialize the speech recognition service
- `Future<bool> isAvailable()` - Check if speech recognition is available
- `Future<bool> hasPermission()` - Check if microphone permission is granted
- `Future<bool> requestPermission()` - Request microphone permission
- `Future<void> startListening({String localeId, bool partialResults, bool onDevice})` - Start listening
- `Future<void> stopListening()` - Stop listening and get final result
- `Future<void> cancel()` - Cancel current recognition session
- `Future<bool> get isListening` - Check if currently listening
#### Streams
- `Stream<SpeechRecognitionResult> onResult` - Stream of recognition results
- `Stream<SpeechRecognitionError> onError` - Stream of recognition errors
- `Stream<bool> onListeningStatusChanged` - Stream of listening status changes
### SpeechRecognitionResult
```dart
class SpeechRecognitionResult {
  final String recognizedWords;    // The recognized text
  final bool finalResult;          // Whether this is a final result
  final double confidence;         // Confidence level (0.0 to 1.0)
  final List<String> alternatives; // Alternative recognition results
}
```
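When partial results are enabled, the `finalResult` flag distinguishes interim transcripts (which may still change) from finished utterances. A sketch of handling the two cases, assuming the `onResult` stream shown above:

```dart
import 'package:yx_asr/yx_asr.dart';

void listenForFinalResults(YxAsr asr, void Function(String) onFinal) {
  asr.onResult.listen((result) {
    if (result.finalResult) {
      // Commit the finished utterance, e.g. append it to a transcript
      onFinal(result.recognizedWords);
    } else {
      // Interim text; safe to display live, but it may still be revised
      print('Partial: ${result.recognizedWords}');
    }
  });
}
```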
### SpeechRecognitionError
```dart
class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType; // Type of error
  final String errorMsg;                      // Human-readable error message
  final String? errorCode;                    // Platform-specific error code
}
```
### RecordingButton Widget
#### Properties
- `onResult` - Callback for recognition results
- `onError` - Callback for recognition errors
- `onListeningStatusChanged` - Callback for status changes
- `localeId` - Language locale (default: 'en-US')
- `partialResults` - Enable partial results (default: true)
- `onDevice` - Use on-device recognition on iOS (default: false)
- `size` - Button size (default: 80.0)
- `idleColor` - Button color when not recording
- `recordingColor` - Button color when recording
- `disabledColor` - Button color when disabled
- `enabled` - Whether the button is enabled (default: true)
- `tooltip` - Tooltip text
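Putting several of these properties together, a fully customized button might look like the following; the colors, size, and tooltip text are purely illustrative:

```dart
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  localeId: 'zh-CN',
  partialResults: true,
  size: 64.0,
  idleColor: Colors.blueGrey,       // shown while idle
  recordingColor: Colors.redAccent, // shown while recording
  disabledColor: Colors.grey,       // shown when enabled is false
  enabled: true,
  tooltip: 'Tap to talk',
  onResult: (result) => debugPrint(result.recognizedWords),
  onError: (error) => debugPrint(error.errorMsg),
  onListeningStatusChanged: (listening) => debugPrint('listening: $listening'),
)
```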
## Supported Languages
The plugin supports multiple languages including:
- English (en-US, en-GB)
- Chinese (zh-CN, zh-TW)
- Japanese (ja-JP)
- Korean (ko-KR)
- Spanish (es-ES)
- French (fr-FR)
- German (de-DE)
- Italian (it-IT)
## Error Handling
The plugin provides comprehensive error handling through the `SpeechRecognitionError` class:
```dart
_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    // ... handle other error types
  }
});
```
## Best Practices
1. **Always check permissions** before starting recognition
2. **Handle errors gracefully** to provide good user experience
3. **Use partial results** for real-time feedback
4. **Stop listening** when done to conserve battery
5. **Test on real devices** as speech recognition doesn't work well on simulators
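Practices 1 and 4 can be sketched as a small wrapper around the API listed in the reference above (the helper names here are our own, not part of the plugin):

```dart
import 'package:yx_asr/yx_asr.dart';

// Best practice 1: always check permissions before starting recognition
Future<bool> safeStartListening(YxAsr asr) async {
  if (!await asr.hasPermission()) {
    final granted = await asr.requestPermission();
    if (!granted) return false; // user declined; don't start
  }
  await asr.startListening(partialResults: true);
  return true;
}

// Best practice 4: stop listening when done to conserve battery
Future<void> safeStopListening(YxAsr asr) async {
  if (await asr.isListening) {
    await asr.stopListening();
  }
}
```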
## Example App
Check out the `example/` directory for a comprehensive example app that demonstrates:
- Real-time speech recognition
- Multiple language support
- Error handling
- Recognition history
- Customizable settings
## Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Troubleshooting
### Common Issues
1. **Permission Denied Error**
   - Ensure microphone permissions are added to the platform manifests
   - Call `requestPermission()` before starting recognition
2. **Speech Recognition Not Available**
   - Check availability with `isAvailable()`
   - Verify that the model files are bundled at the expected asset path and that `initializeWithModel()` returned `true`
3. **No Speech Detected**
   - Check the microphone hardware
   - Ensure the app has microphone permission
   - Try speaking louder or closer to the microphone
4. **Model Loading Errors**
   - Verify the model directory contains `encoder.onnx`, `decoder.onnx`, `joiner.onnx`, and `tokens.txt`
   - Recognition runs fully offline, so a missing network connection is not the cause
### Testing
- Speech recognition doesn't work well on simulators/emulators
- Always test on real devices
- Test in quiet environments for better accuracy
## Support
For issues and feature requests, please use the GitHub issue tracker.