Flutter speech-to-text toolkit
Max 75080d0c0d Fix duplicate recognition issue in _recognitionTimer
Critical fix for recognition logic to prevent duplicate processing:

1. Problem Identified:
   - _recognitionTimer was repeatedly calling decode() on same audio data
   - Same recognition results were being sent multiple times to UI
   - Caused redundant processing and potential performance issues

2. Solution Implemented:
   - Add _lastRecognizedText state variable to track previous results
   - Only send recognition results when text content actually changes
   - Reset _lastRecognizedText when starting new recording session

3. Logic Changes:
   - Enhanced recognition loop with duplicate detection
   - Added debug logging for skipped duplicate results
   - Reset state on startListening() to ensure clean slate

4. Benefits:
   - Eliminates duplicate recognition results sent to UI
   - Reduces unnecessary computation and network overhead
   - Improves user experience with cleaner, non-repetitive updates
   - Better resource utilization and battery life

This fix addresses the core issue where the recognition timer was
processing the same audio stream content repeatedly, ensuring each
unique recognition result is only sent once to the application.
2025-09-09 16:58:47 +08:00
.dart_tool Fix Flutter assets configuration: explicitly specify all model files in pubspec.yaml 2025-09-06 17:36:54 +08:00
.vscode Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
android Fix Kotlin null safety issue: add null check for context in isAvailable method 2025-09-06 17:26:24 +08:00
build/ios/XCBuildData/PIFCache Improve demo: use RecordingButton component instead of custom implementation 2025-09-09 11:04:27 +08:00
coverage Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
doc Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
example Remove recognition history functionality 2025-09-09 16:22:38 +08:00
ios Fix Kotlin null safety issue: add null check for context in isAvailable method 2025-09-06 17:26:24 +08:00
lib Fix duplicate recognition issue in _recognitionTimer 2025-09-09 16:58:47 +08:00
scripts Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
test Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
CHANGELOG.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
LICENSE Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
README.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
SHERPA_ONNX_USAGE.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
debug_app.sh Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
pubspec.lock Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
pubspec.yaml Fix build issues: remove duplicate gradle files, increase heap size, fix plugin configuration 2025-09-06 17:23:14 +08:00
run_ios_app.sh Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
test_config.yaml Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
test_model_config.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
使用说明.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
权限问题解决指南.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
模型文件路径修复方案.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
测试用例说明.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
测试运行结果.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
简化使用指南.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00
音频文件测试报告.md Initial commit: Flutter speech-to-text plugin with Sherpa-ONNX integration 2025-08-27 17:09:36 +08:00

README.md

YX ASR - Flutter Speech-to-Text Plugin

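A Flutter speech recognition plugin built on sherpa_onnx that provides fully offline, real-time speech-to-text.

Features

  • 🎤 Real-time speech recognition: live transcription as you speak
  • 🔄 Toggle recording: simple start/stop recording with visual feedback
  • 🌍 Multi-language support: Chinese, English, and other languages
  • 📱 Cross-platform: supports iOS and Android
  • 🎛️ Customizable UI: a flexible recording button widget with configurable appearance
  • 🔒 Permission management: microphone permission requests are handled automatically
  • Fully offline: built on sherpa_onnx, no network connection required
  • 🎯 High-accuracy recognition: uses neural network models
  • 🚀 Low latency: audio is processed in real time
  • 🔐 Privacy: voice data is never uploaded to the cloud

Installation

Add the dependency to your pubspec.yaml file: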

dependencies:
  yx_asr: ^1.0.0

Then run:

flutter pub get

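Preparing Model Files

Because the plugin uses sherpa_onnx, you need to download the corresponding model files:

  1. Chinese model (recommended)

  2. English model

    • Model name: sherpa-onnx-streaming-zipformer-en-2023-02-21
    • Extract to: assets/models/en-us/
  3. Model file layout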

assets/models/
├── zh-cn/
│   ├── encoder.onnx
│   ├── decoder.onnx
│   ├── joiner.onnx
│   └── tokens.txt
└── en-us/
    ├── encoder.onnx
    ├── decoder.onnx
    ├── joiner.onnx
    └── tokens.txt
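
A missing model file is easy to overlook, so a small pre-build check can help. The following is a minimal sketch, assuming it is run from the project root with the Dart CLI before building; `requiredModelFiles` and `missingModelFiles` are hypothetical names for illustration and are not part of the plugin.

```dart
import 'dart:io';

/// File names expected in each model directory, per the layout above.
const requiredModelFiles = [
  'encoder.onnx',
  'decoder.onnx',
  'joiner.onnx',
  'tokens.txt',
];

/// Returns the required files that are missing from [modelDir].
/// An empty list means the directory looks complete.
List<String> missingModelFiles(String modelDir) {
  return requiredModelFiles
      .where((name) => !File('$modelDir/$name').existsSync())
      .toList();
}

void main() {
  for (final dir in ['assets/models/zh-cn', 'assets/models/en-us']) {
    final missing = missingModelFiles(dir);
    final status = missing.isEmpty ? 'OK' : 'missing ${missing.join(", ")}';
    print('$dir: $status');
  }
}
```

This only inspects the source tree on the development machine. At runtime the files are bundled as Flutter assets, so they also need to be declared in pubspec.yaml.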

Platform Configuration

Android

Add the permission to android/app/src/main/AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

iOS

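Add the permission to ios/Runner/Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your speech for recognition</string>

Note: because recognition runs offline through sherpa_onnx, neither the network permission nor the speech recognition permission is required.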

Quick Start

Basic Usage

import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model
    bool initialized = await _speechToText.initializeWithModel('assets/models/zh-cn');

    if (initialized) {
      // Listen for recognition results
      _speechToText.onResult.listen((result) {
        setState(() {
          _recognizedText = result.recognizedWords;
        });
      });

      // Listen for errors
      _speechToText.onError.listen((error) {
        print('Speech recognition error: ${error.errorMsg}');
      });

      // Listen for listening-state changes
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      });
    }
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // Enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Result: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}

Using the Recording Button Widget

The plugin includes a customizable RecordingButton widget:

import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)

API Reference

YxAsr Class

Methods

  • Future<bool> initialize() - Initialize the speech recognition service
  • Future<bool> isAvailable() - Check if speech recognition is available
  • Future<bool> hasPermission() - Check if microphone permission is granted
  • Future<bool> requestPermission() - Request microphone permission
  • Future<void> startListening({String localeId, bool partialResults, bool onDevice}) - Start listening
  • Future<void> stopListening() - Stop listening and get final result
  • Future<void> cancel() - Cancel current recognition session
  • Future<bool> get isListening - Check if currently listening

Streams

  • Stream<SpeechRecognitionResult> onResult - Stream of recognition results
  • Stream<SpeechRecognitionError> onError - Stream of recognition errors
  • Stream<bool> onListeningStatusChanged - Stream of listening status changes
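
If a listener should never rebuild the UI for a result identical to the previous one, a consumer-side guard can be layered on top of the result stream. The sketch below uses Dart's built-in `Stream.distinct`. The `Result` class is a local stand-in holding only two of the documented fields, and `dedupe` is a hypothetical helper, not part of the plugin.

```dart
/// Minimal stand-in for the plugin's result type (only the fields used here).
class Result {
  final String recognizedWords;
  final bool finalResult;
  const Result(this.recognizedWords, {this.finalResult = false});
}

/// Drops a result when it repeats the immediately preceding one exactly.
Stream<Result> dedupe(Stream<Result> results) {
  return results.distinct((previous, next) =>
      previous.recognizedWords == next.recognizedWords &&
      previous.finalResult == next.finalResult);
}

Future<void> main() async {
  final source = Stream.fromIterable(const [
    Result('hello'),
    Result('hello'), // repeated partial result, dropped
    Result('hello world'),
    Result('hello world', finalResult: true), // same text, but final: kept
  ]);
  await for (final r in dedupe(source)) {
    print('${r.recognizedWords} (final: ${r.finalResult})');
  }
}
```

Comparing `finalResult` as well as the text keeps the final result from being swallowed when it matches the last partial one.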

SpeechRecognitionResult

class SpeechRecognitionResult {
  final String recognizedWords;    // The recognized text
  final bool finalResult;          // Whether this is a final result
  final double confidence;         // Confidence level (0.0 to 1.0)
  final List<String> alternatives; // Alternative recognition results
}
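
The `finalResult` flag lets a consumer treat partial and final results differently. The sketch below assumes that each partial result supersedes the previous one and that a final result closes the current segment. The README does not spell out this behavior, so verify it against the plugin. `Transcript` is a hypothetical helper, and the result class is redeclared locally with a subset of its fields so the example runs on its own.

```dart
/// Local mirror of two documented fields, so the sketch runs without the plugin.
class SpeechRecognitionResult {
  final String recognizedWords;
  final bool finalResult;
  const SpeechRecognitionResult(this.recognizedWords, this.finalResult);
}

/// Keeps committed (final) text separate from the latest partial hypothesis.
class Transcript {
  final List<String> _committed = [];
  String _partial = '';

  void add(SpeechRecognitionResult r) {
    if (r.finalResult) {
      // A final result closes the segment and clears the partial preview.
      if (r.recognizedWords.isNotEmpty) _committed.add(r.recognizedWords);
      _partial = '';
    } else {
      // A partial result replaces the previous preview.
      _partial = r.recognizedWords;
    }
  }

  /// Committed segments followed by the current preview, joined by spaces.
  String get text =>
      [..._committed, if (_partial.isNotEmpty) _partial].join(' ');
}

void main() {
  final t = Transcript()
    ..add(const SpeechRecognitionResult('hello', false))
    ..add(const SpeechRecognitionResult('hello world', true))
    ..add(const SpeechRecognitionResult('how are', false));
  print(t.text); // hello world how are
}
```

Joining segments with a space suits English; for Chinese output an empty separator is probably the better choice.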

SpeechRecognitionError

class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType; // Type of error
  final String errorMsg;                      // Human-readable error message
  final String? errorCode;                    // Platform-specific error code
}

RecordingButton Widget

Properties

  • onResult - Callback for recognition results
  • onError - Callback for recognition errors
  • onListeningStatusChanged - Callback for status changes
  • localeId - Language locale (default: 'en-US')
  • partialResults - Enable partial results (default: true)
  • onDevice - Use on-device recognition on iOS (default: false)
  • size - Button size (default: 80.0)
  • idleColor - Button color when not recording
  • recordingColor - Button color when recording
  • disabledColor - Button color when disabled
  • enabled - Whether the button is enabled (default: true)
  • tooltip - Tooltip text

Supported Languages

The plugin supports multiple languages including:

  • English (en-US, en-GB)
  • Chinese (zh-CN, zh-TW)
  • Japanese (ja-JP)
  • Korean (ko-KR)
  • Spanish (es-ES)
  • French (fr-FR)
  • German (de-DE)
  • Italian (it-IT)

Error Handling

The plugin provides comprehensive error handling through the SpeechRecognitionError class:

_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    // ... handle other error types
  }
});

Best Practices

  1. Always check permissions before starting recognition
  2. Handle errors gracefully to provide good user experience
  3. Use partial results for real-time feedback
  4. Stop listening when done to conserve battery
  5. Test on real devices as speech recognition doesn't work well on simulators
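
The first practice above can be sketched as a small guard that runs before every recording session. To keep the example runnable without a device, it is written against a hypothetical `SpeechService` interface that mirrors only the method names listed in the API reference. `startSafely` and `FakeService` are illustrative names as well. In an app, the same sequence of calls would be made on a `YxAsr` instance.

```dart
/// Hypothetical interface exposing only methods named in the API reference.
abstract class SpeechService {
  Future<bool> isAvailable();
  Future<bool> hasPermission();
  Future<bool> requestPermission();
  Future<void> startListening({bool partialResults = true});
}

/// Starts listening only when recognition is available and the microphone
/// permission is granted (requesting it once if needed).
/// Returns false when listening was not started.
Future<bool> startSafely(SpeechService asr) async {
  if (!await asr.isAvailable()) return false;
  if (!await asr.hasPermission() && !await asr.requestPermission()) {
    return false;
  }
  await asr.startListening(partialResults: true);
  return true;
}

/// In-memory fake used to exercise the flow without a device.
class FakeService implements SpeechService {
  FakeService({required this.grantOnRequest});

  final bool grantOnRequest;
  bool _granted = false;
  bool started = false;

  @override
  Future<bool> isAvailable() async => true;

  @override
  Future<bool> hasPermission() async => _granted;

  @override
  Future<bool> requestPermission() async => _granted = grantOnRequest;

  @override
  Future<void> startListening({bool partialResults = true}) async {
    started = true;
  }
}

Future<void> main() async {
  print(await startSafely(FakeService(grantOnRequest: true))); // true
  print(await startSafely(FakeService(grantOnRequest: false))); // false
}
```

Returning a bool rather than throwing lets the caller decide how to surface a denied permission in the UI.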

Example App

Check out the example/ directory for a comprehensive example app that demonstrates:

  • Real-time speech recognition
  • Multiple language support
  • Error handling
  • Customizable settings

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Troubleshooting

Common Issues

  1. Permission Denied Error

    • Ensure microphone permissions are added to platform manifests
    • Call requestPermission() before starting recognition
  2. Speech Recognition Not Available

    • Check if device supports speech recognition with isAvailable()
    • Ensure Google app is installed and updated on Android
  3. No Speech Detected

    • Check microphone hardware
    • Ensure app has microphone permission
    • Try speaking louder or closer to the microphone
  4. Network Errors

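    • Recognition runs fully offline through sherpa_onnx, so a network error is not expected during normal use
    • If one is reported anyway, file it through the issue tracker with the error message and error code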

Testing

  • Speech recognition doesn't work well on simulators/emulators
  • Always test on real devices
  • Test in quiet environments for better accuracy

Support

For issues and feature requests, please use the GitHub issue tracker.