Critical fix for recognition logic to prevent duplicate processing:
1. Problem Identified:
- _recognitionTimer was repeatedly calling decode() on the same audio data
- The same recognition results were sent to the UI multiple times
- This caused redundant processing and potential performance issues
2. Solution Implemented:
- Added a _lastRecognizedText state variable to track the previous result
- Only send recognition results when the text content actually changes
- Reset _lastRecognizedText when starting a new recording session
3. Logic Changes:
- Enhanced the recognition loop with duplicate detection
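The duplicate-detection logic described above might look roughly like the following Dart sketch. This is illustrative only; names such as _decodeCurrentAudio and _resultController are hypothetical and not the plugin's actual internals.

```dart
// Illustrative sketch of the duplicate-detection logic; the helper names
// (_decodeCurrentAudio, _resultController) are hypothetical.
String? _lastRecognizedText;

void _onRecognitionTick() {
  final text = _decodeCurrentAudio(); // decode whatever audio is buffered
  if (text == _lastRecognizedText) {
    // Same text as the previous tick: skip to avoid a duplicate UI update.
    return;
  }
  _lastRecognizedText = text;
  _resultController.add(text); // emit only genuinely new results
}

void _startListening() {
  _lastRecognizedText = null; // clean slate for the new session
  // ... start the recorder and the periodic _recognitionTimer ...
}
```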
- Added debug logging for skipped duplicate results
- Reset state on startListening() to ensure clean slate
4. Benefits:
- Eliminates duplicate recognition results sent to UI
- Reduces unnecessary computation and network overhead
- Improves user experience with cleaner, non-repetitive updates
- Better resource utilization and battery life
This fix addresses the core issue where the recognition timer was
processing the same audio stream content repeatedly, ensuring each
unique recognition result is only sent once to the application.
YX ASR - Flutter Speech-to-Text Plugin
A Flutter speech-recognition plugin based on sherpa_onnx, providing fully offline, real-time speech-to-text.
Features
- 🎤 Real-time speech recognition: live transcription as you speak
- 🔄 Toggle recording: simple start/stop recording with visual feedback
- 🌍 Multi-language support: Chinese, English, and more
- 📱 Cross-platform: supports iOS and Android
- 🎛️ Customizable UI: flexible recording button widget with customizable appearance
- 🔒 Permission handling: automatically requests microphone permission
- ⚡ Fully offline: based on sherpa_onnx, no network connection required
- 🎯 High accuracy: uses state-of-the-art neural network models
- 🚀 Low latency: real-time processing with fast response
- 🔐 Privacy-preserving: speech data is never uploaded to the cloud
Installation
Add the dependency to your pubspec.yaml:
dependencies:
  yx_asr: ^1.0.0
Then run:
flutter pub get
Model File Preparation
Because this plugin uses sherpa_onnx, you need to download the corresponding model files:
Chinese model (recommended)
- Download from: https://github.com/k2-fsa/sherpa-onnx/releases/
- Model name: sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20
- Extract to: assets/models/zh-cn/
English model
- Model name: sherpa-onnx-streaming-zipformer-en-2023-02-21
- Extract to: assets/models/en-us/
Model File Structure
assets/models/
├── zh-cn/
│ ├── encoder.onnx
│ ├── decoder.onnx
│ ├── joiner.onnx
│ └── tokens.txt
└── en-us/
├── encoder.onnx
├── decoder.onnx
├── joiner.onnx
└── tokens.txt
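For the plugin to load models from these paths, the directories above will likely also need to be declared as Flutter assets. A hedged example of the pubspec.yaml declaration (the exact requirement depends on how the plugin resolves asset paths):

```yaml
flutter:
  assets:
    - assets/models/zh-cn/
    - assets/models/en-us/
```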
Platform Configuration
Android
Add the following permission to android/app/src/main/AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
iOS
Add the following key to ios/Runner/Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your voice for speech recognition</string>
Note: Because sherpa_onnx performs recognition fully offline, no network or speech-recognition permissions are required.
Quick Start
Basic Usage
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model
    bool initialized = await _speechToText.initializeWithModel('assets/models/zh-cn');
    if (initialized) {
      // Listen for recognition results
      _speechToText.onResult.listen((result) {
        setState(() {
          _recognizedText = result.recognizedWords;
        });
      });
      // Listen for errors
      _speechToText.onError.listen((error) {
        print('Speech recognition error: ${error.errorMsg}');
      });
      // Listen for listening-status changes
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      });
    }
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // Enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Result: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}
Using the Recording Button Widget
The plugin includes a customizable RecordingButton widget:
import 'package:yx_asr/yx_asr.dart';
RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)
API Reference
YxAsr Class
Methods
- Future<bool> initialize() - Initialize the speech recognition service
- Future<bool> isAvailable() - Check if speech recognition is available
- Future<bool> hasPermission() - Check if microphone permission is granted
- Future<bool> requestPermission() - Request microphone permission
- Future<void> startListening({String localeId, bool partialResults, bool onDevice}) - Start listening
- Future<void> stopListening() - Stop listening and get the final result
- Future<void> cancel() - Cancel the current recognition session
- Future<bool> get isListening - Check if currently listening
Streams
- Stream<SpeechRecognitionResult> onResult - Stream of recognition results
- Stream<SpeechRecognitionError> onError - Stream of recognition errors
- Stream<bool> onListeningStatusChanged - Stream of listening status changes
SpeechRecognitionResult
class SpeechRecognitionResult {
  final String recognizedWords;    // The recognized text
  final bool finalResult;          // Whether this is a final result
  final double confidence;         // Confidence level (0.0 to 1.0)
  final List<String> alternatives; // Alternative recognition results
}
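Because onResult emits both partial and final results when partialResults is enabled, a listener will typically branch on finalResult. A minimal sketch using only the fields above (how you display or commit results is up to your app):

```dart
_speechToText.onResult.listen((result) {
  if (result.finalResult) {
    // Final result for this utterance: commit it.
    print('Final: ${result.recognizedWords} '
        '(confidence: ${result.confidence.toStringAsFixed(2)})');
  } else {
    // Partial result: show it as live feedback.
    print('Partial: ${result.recognizedWords}');
  }
});
```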
SpeechRecognitionError
class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType; // Type of error
  final String errorMsg;                      // Human-readable error message
  final String? errorCode;                    // Platform-specific error code
}
RecordingButton Widget
Properties
- onResult - Callback for recognition results
- onError - Callback for recognition errors
- onListeningStatusChanged - Callback for status changes
- localeId - Language locale (default: 'en-US')
- partialResults - Enable partial results (default: true)
- onDevice - Use on-device recognition on iOS (default: false)
- size - Button size (default: 80.0)
- idleColor - Button color when not recording
- recordingColor - Button color when recording
- disabledColor - Button color when disabled
- enabled - Whether the button is enabled (default: true)
- tooltip - Tooltip text
Supported Languages
The plugin supports multiple languages including:
- English (en-US, en-GB)
- Chinese (zh-CN, zh-TW)
- Japanese (ja-JP)
- Korean (ko-KR)
- Spanish (es-ES)
- French (fr-FR)
- German (de-DE)
- Italian (it-IT)
Error Handling
The plugin provides comprehensive error handling through the SpeechRecognitionError class:
_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    // ... handle other error types
  }
});
Best Practices
- Always check permissions before starting recognition
- Handle errors gracefully to provide good user experience
- Use partial results for real-time feedback
- Stop listening when done to conserve battery
- Test on real devices as speech recognition doesn't work well on simulators
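The first two practices above can be combined into a small guard before starting a session. A sketch using only the methods documented in the API reference (the early-return behavior is one reasonable choice, not prescribed by the plugin):

```dart
Future<void> _startRecognitionSafely() async {
  // Speech recognition may be unsupported on this device.
  if (!await _speechToText.isAvailable()) return;

  // Ensure microphone permission before listening.
  if (!await _speechToText.hasPermission()) {
    final granted = await _speechToText.requestPermission();
    if (!granted) return; // User declined; fail gracefully.
  }

  await _speechToText.startListening(partialResults: true);
}
```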
Example App
Check out the example/ directory for a comprehensive example app that demonstrates:
- Real-time speech recognition
- Multiple language support
- Error handling
- Recognition history
- Customizable settings
Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Troubleshooting
Common Issues
Permission Denied Error
- Ensure microphone permissions are added to the platform manifests
- Call requestPermission() before starting recognition
Speech Recognition Not Available
- Check whether the device supports speech recognition with isAvailable()
- Ensure the Google app is installed and up to date on Android
No Speech Detected
- Check the microphone hardware
- Ensure the app has microphone permission
- Try speaking louder or closer to the microphone
Network Errors
- Check internet connectivity
- Some platforms require a network connection for speech recognition (sherpa_onnx itself runs fully offline)
Testing
- Speech recognition doesn't work well on simulators/emulators
- Always test on real devices
- Test in quiet environments for better accuracy
Support
For issues and feature requests, please use the GitHub issue tracker.