# YX ASR - Flutter Speech-to-Text Plugin
A Flutter speech-recognition plugin built on sherpa_onnx, providing fully offline, real-time speech-to-text.
## Features
- 🎤 **Real-time recognition**: transcribe speech as you speak
- 🔄 **Toggle recording**: simple start/stop recording with visual feedback
- 🌍 **Multi-language support**: Chinese, English, and more
- 📱 **Cross-platform**: supports iOS and Android
- 🎛️ **Customizable UI**: flexible recording-button widget with configurable appearance
- 🔒 **Permission handling**: microphone permission requests are handled automatically
- 📴 **Fully offline**: based on sherpa_onnx, no network connection required
- 🎯 **High accuracy**: powered by modern neural-network models
- 🚀 **Low latency**: real-time processing with fast response
- 🔐 **Privacy-friendly**: audio data never leaves the device
## Installation
Add the dependency to your `pubspec.yaml`:
```yaml
dependencies:
  yx_asr: ^1.0.0
```
Then run:
```bash
flutter pub get
```
## Preparing Model Files
Because the plugin uses sherpa_onnx, you need to download the corresponding model files:
1. **Chinese model** (recommended)
   - Download from: https://github.com/k2-fsa/sherpa-onnx/releases/
   - Model name: `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`
   - Extract to: `assets/models/zh-cn/`
2. **English model**
   - Model name: `sherpa-onnx-streaming-zipformer-en-2023-02-21`
   - Extract to: `assets/models/en-us/`
3. **Model file layout**
```
assets/models/
├── zh-cn/
│   ├── encoder.onnx
│   ├── decoder.onnx
│   ├── joiner.onnx
│   └── tokens.txt
└── en-us/
    ├── encoder.onnx
    ├── decoder.onnx
    ├── joiner.onnx
    └── tokens.txt
```
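If the models are shipped as Flutter assets (as the paths above suggest), remember to declare the directories in your app's `pubspec.yaml` so Flutter bundles them with the app. A minimal sketch:

```yaml
flutter:
  assets:
    - assets/models/zh-cn/
    - assets/models/en-us/
```

Note that directory entries in `assets` include the files directly inside that directory, so each model folder must be listed.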
## Platform Setup
### Android
Add the microphone permission to `android/app/src/main/AndroidManifest.xml`:
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```
### iOS
Add a usage description to `ios/Runner/Info.plist`:
```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your speech for recognition</string>
```
Note: because sherpa_onnx performs recognition entirely offline, no network or speech-recognition permissions are required.
## Quick Start
### Basic Usage
```dart
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

class MyApp extends StatefulWidget {
  @override
  _MyAppState createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final YxAsr _speechToText = YxAsr();
  String _recognizedText = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeechToText();
  }

  Future<void> _initializeSpeechToText() async {
    // Initialize with the Chinese model
    bool initialized =
        await _speechToText.initializeWithModel('assets/models/zh-cn');
    if (initialized) {
      // Listen for recognition results
      _speechToText.onResult.listen((result) {
        setState(() {
          _recognizedText = result.recognizedWords;
        });
      });
      // Listen for errors
      _speechToText.onError.listen((error) {
        print('Speech recognition error: ${error.errorMsg}');
      });
      // Listen for listening-status changes
      _speechToText.onListeningStatusChanged.listen((isListening) {
        setState(() {
          _isListening = isListening;
        });
      });
    }
  }

  Future<void> _toggleRecording() async {
    if (_isListening) {
      await _speechToText.stopListening();
    } else {
      await _speechToText.startListening(
        partialResults: true, // enable partial results
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Text('Result: $_recognizedText'),
          ElevatedButton(
            onPressed: _toggleRecording,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}
```
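The basic example above never cancels its stream subscriptions. In a real widget you would typically keep the `StreamSubscription` handles returned by `listen()` and cancel them in `dispose()`. A minimal sketch, assuming the same `YxAsr` streams:

```dart
import 'dart:async';
import 'package:yx_asr/yx_asr.dart';

class SpeechController {
  final YxAsr _asr = YxAsr();
  final List<StreamSubscription> _subscriptions = [];

  Future<void> init() async {
    if (await _asr.initializeWithModel('assets/models/zh-cn')) {
      // Keep the subscription handles so they can be cancelled later
      _subscriptions.add(_asr.onResult.listen((r) => print(r.recognizedWords)));
      _subscriptions.add(_asr.onError.listen((e) => print(e.errorMsg)));
    }
  }

  // Call this from your State's dispose() to avoid leaking subscriptions
  Future<void> dispose() async {
    for (final sub in _subscriptions) {
      await sub.cancel();
    }
    await _asr.cancel();
  }
}
```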
### Using the Recording Button Widget
The plugin includes a customizable `RecordingButton` widget:
```dart
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  onResult: (result) {
    print('Result: ${result.recognizedWords}');
  },
  onError: (error) {
    print('Error: ${error.errorMsg}');
  },
  onListeningStatusChanged: (isListening) {
    print('Listening: $isListening');
  },
  localeId: 'en-US',
  partialResults: true,
  size: 80.0,
  tooltip: 'Tap to record',
)
```
## API Reference
### YxAsr Class
#### Methods
- `Future<bool> initialize()` - Initialize the speech recognition service
- `Future<bool> isAvailable()` - Check if speech recognition is available
- `Future<bool> hasPermission()` - Check if microphone permission is granted
- `Future<bool> requestPermission()` - Request microphone permission
- `Future<void> startListening({String localeId, bool partialResults, bool onDevice})` - Start listening
- `Future<void> stopListening()` - Stop listening and get final result
- `Future<void> cancel()` - Cancel current recognition session
- `Future<bool> get isListening` - Check if currently listening
#### Streams
- `Stream<SpeechRecognitionResult> onResult` - Stream of recognition results
- `Stream<SpeechRecognitionError> onError` - Stream of recognition errors
- `Stream<bool> onListeningStatusChanged` - Stream of listening status changes
### SpeechRecognitionResult
```dart
class SpeechRecognitionResult {
  final String recognizedWords;    // The recognized text
  final bool finalResult;          // Whether this is a final result
  final double confidence;         // Confidence level (0.0 to 1.0)
  final List<String> alternatives; // Alternative recognition results
}
```
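When partial results are enabled, the `finalResult` flag distinguishes interim transcripts (which may still change) from finished utterances. A sketch of handling the two cases, assuming the `onResult` stream shown above:

```dart
import 'package:yx_asr/yx_asr.dart';

void listenForFinalResults(YxAsr asr, void Function(String) onFinal) {
  asr.onResult.listen((result) {
    if (result.finalResult) {
      // Commit the finished utterance, e.g. append it to a transcript
      onFinal(result.recognizedWords);
    } else {
      // Interim text; safe to display live, but it may still be revised
      print('Partial: ${result.recognizedWords}');
    }
  });
}
```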
### SpeechRecognitionError
```dart
class SpeechRecognitionError {
  final SpeechRecognitionErrorType errorType; // Type of error
  final String errorMsg;                      // Human-readable error message
  final String? errorCode;                    // Platform-specific error code
}
```
### RecordingButton Widget
#### Properties
- `onResult` - Callback for recognition results
- `onError` - Callback for recognition errors
- `onListeningStatusChanged` - Callback for status changes
- `localeId` - Language locale (default: 'en-US')
- `partialResults` - Enable partial results (default: true)
- `onDevice` - Use on-device recognition on iOS (default: false)
- `size` - Button size (default: 80.0)
- `idleColor` - Button color when not recording
- `recordingColor` - Button color when recording
- `disabledColor` - Button color when disabled
- `enabled` - Whether the button is enabled (default: true)
- `tooltip` - Tooltip text
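Putting several of these properties together, a fully customized button might look like the following; the colors, size, and tooltip text are purely illustrative:

```dart
import 'package:flutter/material.dart';
import 'package:yx_asr/yx_asr.dart';

RecordingButton(
  localeId: 'zh-CN',
  partialResults: true,
  size: 64.0,
  idleColor: Colors.blueGrey,       // shown while idle
  recordingColor: Colors.redAccent, // shown while recording
  disabledColor: Colors.grey,       // shown when enabled is false
  enabled: true,
  tooltip: 'Tap to talk',
  onResult: (result) => debugPrint(result.recognizedWords),
  onError: (error) => debugPrint(error.errorMsg),
  onListeningStatusChanged: (listening) => debugPrint('listening: $listening'),
)
```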
## Supported Languages
The plugin supports multiple languages including:
- English (en-US, en-GB)
- Chinese (zh-CN, zh-TW)
- Japanese (ja-JP)
- Korean (ko-KR)
- Spanish (es-ES)
- French (fr-FR)
- German (de-DE)
- Italian (it-IT)
## Error Handling
The plugin provides comprehensive error handling through the `SpeechRecognitionError` class:
```dart
_speechToText.onError.listen((error) {
  switch (error.errorType) {
    case SpeechRecognitionErrorType.permissionDenied:
      // Handle permission denied
      break;
    case SpeechRecognitionErrorType.network:
      // Handle network errors
      break;
    case SpeechRecognitionErrorType.noSpeech:
      // Handle no speech detected
      break;
    // ... handle other error types
  }
});
```
## Best Practices
1. **Always check permissions** before starting recognition
2. **Handle errors gracefully** to provide good user experience
3. **Use partial results** for real-time feedback
4. **Stop listening** when done to conserve battery
5. **Test on real devices** as speech recognition doesn't work well on simulators
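Practices 1 and 4 can be sketched as a small wrapper around the API listed in the reference above (the helper names here are our own, not part of the plugin):

```dart
import 'package:yx_asr/yx_asr.dart';

// Best practice 1: always check permissions before starting recognition
Future<bool> safeStartListening(YxAsr asr) async {
  if (!await asr.hasPermission()) {
    final granted = await asr.requestPermission();
    if (!granted) return false; // user declined; don't start
  }
  await asr.startListening(partialResults: true);
  return true;
}

// Best practice 4: stop listening when done to conserve battery
Future<void> safeStopListening(YxAsr asr) async {
  if (await asr.isListening) {
    await asr.stopListening();
  }
}
```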
## Example App
Check out the `example/` directory for a comprehensive example app that demonstrates:
- Real-time speech recognition
- Multiple language support
- Error handling
- Recognition history
- Customizable settings
## Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Troubleshooting
### Common Issues
1. **Permission Denied Error**
   - Ensure microphone permissions are added to the platform manifests
   - Call `requestPermission()` before starting recognition
2. **Speech Recognition Not Available**
   - Check availability with `isAvailable()`
   - Verify that the model files are bundled at the expected asset path and that `initializeWithModel()` returned `true`
3. **No Speech Detected**
   - Check the microphone hardware
   - Ensure the app has microphone permission
   - Try speaking louder or closer to the microphone
4. **Model Loading Errors**
   - Verify the model directory contains `encoder.onnx`, `decoder.onnx`, `joiner.onnx`, and `tokens.txt`
   - Recognition runs fully offline, so a missing network connection is not the cause
### Testing
- Speech recognition doesn't work well on simulators/emulators
- Always test on real devices
- Test in quiet environments for better accuracy
## Support
For issues and feature requests, please use the GitHub issue tracker.