# YX ASR - Flutter Speech-to-Text Plugin

A Flutter speech recognition plugin built on sherpa_onnx, providing fully offline, real-time speech-to-text.

## Features

- 🎤 Real-time recognition: transcribe speech as it is spoken
- 🔄 Toggle recording: simple start/stop recording with visual feedback
- 🌍 Multi-language support: Chinese, English, and more
- 📱 Cross-platform: runs on iOS and Android
- 🎛️ Custom UI: flexible recording-button widget with customizable appearance
- 🔒 Permission handling: requests microphone permission automatically
- ⚡ Fully offline: built on sherpa_onnx, no network connection required
- 🎯 High accuracy: uses modern neural network models
- 🚀 Low latency: real-time processing with fast response
- 🔐 Privacy-friendly: audio data is never uploaded to the cloud
## Installation

Add the dependency to your pubspec.yaml:

```yaml
dependencies:
  yx_asr: ^1.0.0
```

Then run:

```shell
flutter pub get
```
## Model Files

Because the plugin uses sherpa_onnx, you need to download the corresponding model files:

- Chinese model (recommended)
  - Download from: https://github.com/k2-fsa/sherpa-onnx/releases/
  - Model name: `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`
  - Extract to: `assets/models/zh-cn/`
- English model
  - Download from: https://github.com/k2-fsa/sherpa-onnx/releases/
  - Model name: `sherpa-onnx-streaming-zipformer-en-2023-02-21`
  - Extract to: `assets/models/en-us/`
### Model File Structure

```
assets/models/
├── zh-cn/
│   ├── encoder.onnx
│   ├── decoder.onnx
│   ├── joiner.onnx
│   └── tokens.txt
└── en-us/
    ├── encoder.onnx
    ├── decoder.onnx
    ├── joiner.onnx
    └── tokens.txt
```
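For Flutter to bundle these files with the app, they typically also need to be declared as assets in pubspec.yaml. A sketch, assuming the directory layout above (adjust the paths to the models you actually ship):

```yaml
flutter:
  assets:
    - assets/models/zh-cn/
    - assets/models/en-us/
```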
## Platform Configuration

### Android

Add the permission to android/app/src/main/AndroidManifest.xml:

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

### iOS

Add the usage description to ios/Runner/Info.plist:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your voice for recognition</string>
```

Note: because sherpa_onnx performs recognition offline, no network or speech-recognition permissions are required.
## Quick Start

### Basic Usage

```dart
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model.
    bool initialized = await _speechToText.initializeWithModel('assets/models/zh-cn');
    if (initialized) {
      // Listen for recognition results.
      _speechToText.onResult.listen((result) {
        setState(() {
          _recognizedText = result.recognizedWords;
        });
      });

      // Listen for errors.
      _speechToText.onError.listen((error) {
        print('Speech recognition error: ${error.errorMsg}');
      });

      // Listen for status changes.
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      });
    }
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Result: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}
```
### Using the Recording Button Widget

The plugin includes a customizable `RecordingButton` widget:

```dart
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)
```
## API Reference

### YxAsr Class

#### Methods

- `Future<bool> initialize()` - Initialize the speech recognition service
- `Future<bool> isAvailable()` - Check if speech recognition is available
- `Future<bool> hasPermission()` - Check if microphone permission is granted
- `Future<bool> requestPermission()` - Request microphone permission
- `Future<void> startListening({String localeId, bool partialResults, bool onDevice})` - Start listening
- `Future<void> stopListening()` - Stop listening and get the final result
- `Future<void> cancel()` - Cancel the current recognition session
- `Future<bool> get isListening` - Check if currently listening
#### Streams

- `Stream<SpeechRecognitionResult> onResult` - Stream of recognition results
- `Stream<SpeechRecognitionError> onError` - Stream of recognition errors
- `Stream<bool> onListeningStatusChanged` - Stream of listening status changes
### SpeechRecognitionResult

```dart
class SpeechRecognitionResult {
  final String recognizedWords;    // The recognized text
  final bool finalResult;          // Whether this is a final result
  final double confidence;         // Confidence level (0.0 to 1.0)
  final List<String> alternatives; // Alternative recognition results
}
```
### SpeechRecognitionError

```dart
class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType; // Type of error
  final String errorMsg;                      // Human-readable error message
  final String? errorCode;                    // Platform-specific error code
}
```
### RecordingButton Widget

#### Properties

- `onResult` - Callback for recognition results
- `onError` - Callback for recognition errors
- `onListeningStatusChanged` - Callback for status changes
- `localeId` - Language locale (default: `'en-US'`)
- `partialResults` - Enable partial results (default: `true`)
- `onDevice` - Use on-device recognition on iOS (default: `false`)
- `size` - Button size (default: `80.0`)
- `idleColor` - Button color when not recording
- `recordingColor` - Button color when recording
- `disabledColor` - Button color when disabled
- `enabled` - Whether the button is enabled (default: `true`)
- `tooltip` - Tooltip text
## Supported Languages
The plugin supports multiple languages including:
- English (en-US, en-GB)
- Chinese (zh-CN, zh-TW)
- Japanese (ja-JP)
- Korean (ko-KR)
- Spanish (es-ES)
- French (fr-FR)
- German (de-DE)
- Italian (it-IT)
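A specific locale from the list above can be requested when starting recognition. A minimal sketch against the `startListening` signature shown in the API reference (the helper name is illustrative):

```dart
import 'package:yx_asr/yx_asr.dart';

// Hypothetical helper: start a Japanese recognition session.
Future<void> startJapanese(YxAsr asr) async {
  await asr.startListening(
    localeId: 'ja-JP',    // one of the supported locale IDs
    partialResults: true, // stream interim results as they arrive
  );
}
```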
## Error Handling

The plugin provides comprehensive error handling through the `SpeechRecognitionError` class:

```dart
_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    // ... handle other error types
  }
});
```
## Best Practices
- Always check permissions before starting recognition
- Handle errors gracefully to provide good user experience
- Use partial results for real-time feedback
- Stop listening when done to conserve battery
- Test on real devices as speech recognition doesn't work well on simulators
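The first two practices might look like this in code. This is a sketch against the API listed above; the helper name `safeStart` is illustrative:

```dart
import 'package:yx_asr/yx_asr.dart';

/// Hypothetical helper: start listening only after confirming
/// availability and microphone permission.
Future<bool> safeStart(YxAsr asr) async {
  if (!await asr.isAvailable()) return false;
  if (!await asr.hasPermission() && !await asr.requestPermission()) {
    return false; // user declined microphone access
  }
  await asr.startListening(partialResults: true);
  return true;
}
```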
## Example App

Check out the `example/` directory for a comprehensive example app that demonstrates:
- Real-time speech recognition
- Multiple language support
- Error handling
- Recognition history
- Customizable settings
## Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Troubleshooting

### Common Issues

- **Permission Denied Error**
  - Ensure microphone permissions are added to the platform manifests
  - Call `requestPermission()` before starting recognition
- **Speech Recognition Not Available**
  - Check if the device supports speech recognition with `isAvailable()`
  - Ensure the Google app is installed and updated on Android
- **No Speech Detected**
  - Check the microphone hardware
  - Ensure the app has microphone permission
  - Try speaking louder or closer to the microphone
- **Network Errors**
  - Check internet connectivity
  - Some platforms require network access for speech recognition
### Testing
- Speech recognition doesn't work well on simulators/emulators
- Always test on real devices
- Test in quiet environments for better accuracy
## Support
For issues and feature requests, please use the GitHub issue tracker.