Thank you for your interest in contributing to BatchFlow! This document provides guidelines and information for contributors.
- Go 1.20 or later
- Docker and Docker Compose (for integration tests)
- Git
- Fork and Clone

  ```bash
  git clone https://github.com/rushairer/batchflow.git
  cd batchflow
  ```

- Install Dependencies

  ```bash
  go mod download
  ```

- Verify Setup

  ```bash
  make test-unit
  make lint
  ```

```bash
git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-number
```

- Write clean, well-documented code
- Follow Go best practices and project conventions
- Add tests for new functionality
- Update documentation as needed
```bash
# Run unit tests
make test-unit

# Run linting
make lint

# Run integration tests (optional but recommended)
make docker-sqlite-test
make docker-mysql-test
make docker-postgres-test
make docker-redis-test
```

```bash
git add .
git commit -m "feat: add new feature description"
# or
git commit -m "fix: resolve issue description"
```

Commit Message Format:

- `feat:` - New features
- `fix:` - Bug fixes
- `docs:` - Documentation changes
- `test:` - Test additions or modifications
- `refactor:` - Code refactoring
- `perf:` - Performance improvements
- `chore:` - Maintenance tasks

```bash
git push origin your-branch-name
```

Then create a Pull Request on GitHub.

- Write tests for all new functions and methods
- Aim for at least 80% code coverage
- Use table-driven tests where appropriate
- Mock external dependencies
Example:
```go
func TestBatchFlow_Submit(t *testing.T) {
	tests := []struct {
		name    string
		request *Request
		wantErr bool
	}{
		{
			name:    "valid request",
			request: NewRequest(schema).SetString("name", "test"),
			wantErr: false,
		},
		// Add more test cases
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			// Test implementation
		})
	}
}
```

- Test real database interactions
- Verify performance characteristics
- Test error handling and edge cases
- Use Docker containers for consistent environments
- Add benchmarks for performance-critical code
- Monitor memory allocations
- Test with realistic data volumes
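
To monitor allocations, the standard library's `testing.AllocsPerRun` reports the average number of heap allocations per call. The sketch below assumes two hypothetical helpers, `buildFresh` and `buildReused`, that are not part of BatchFlow; they only contrast allocating a buffer per call with reusing one.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the compiler cannot keep buffers on the stack.
var sink []byte

// reusable is allocated once and shared across calls to buildReused.
var reusable = make([]byte, 0, 64)

// buildFresh allocates a new buffer on every call (hypothetical helper).
func buildFresh() {
	buf := make([]byte, 0, 64)
	buf = append(buf, "row-data"...)
	sink = buf
}

// buildReused resets and refills the shared buffer instead of allocating.
// It is not safe for concurrent use; it exists only for the comparison.
func buildReused() {
	reusable = reusable[:0]
	reusable = append(reusable, "row-data"...)
	sink = reusable
}

// measure returns the average heap allocations per call for both helpers.
func measure() (fresh, reused float64) {
	fresh = testing.AllocsPerRun(1000, buildFresh)
	reused = testing.AllocsPerRun(1000, buildReused)
	return fresh, reused
}

func main() {
	fresh, reused := measure()
	fmt.Printf("fresh: %.0f allocs/op, reused: %.0f allocs/op\n", fresh, reused)
}
```

Inside a benchmark, calling `b.ReportAllocs()` or running `go test -bench . -benchmem` gives the same kind of information per operation.
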
Example:
```go
func BenchmarkBatchFlow_Submit(b *testing.B) {
	batch, _ := NewBatchFlowWithMock(ctx, config)
	request := NewRequest(schema).SetString("name", "test")

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		batch.Submit(ctx, request)
	}
}
```

- Follow standard Go formatting (`go fmt`)
- Use meaningful variable and function names
- Write clear, concise comments
- Keep functions small and focused
- Handle errors appropriately
- Add GoDoc comments for public functions and types
- Update README.md for significant changes
- Include code examples in documentation
- Document configuration options and their effects
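
A minimal sketch of the GoDoc style these guidelines ask for: each comment starts with the name it documents and states the behavior, including edge cases. `FlushPolicy` and `ShouldFlush` are hypothetical names invented for illustration; they are not BatchFlow's API.

```go
package main

import "fmt"

// FlushPolicy describes when a buffered batch should be written out.
//
// FlushPolicy is a hypothetical type used only to illustrate comment style.
type FlushPolicy struct {
	// FlushSize is the number of buffered items that triggers a flush.
	FlushSize int
}

// ShouldFlush reports whether a buffer holding the given number of items
// has reached the configured FlushSize. A non-positive FlushSize never
// triggers a flush.
func (p *FlushPolicy) ShouldFlush(buffered int) bool {
	return p.FlushSize > 0 && buffered >= p.FlushSize
}

func main() {
	p := &FlushPolicy{FlushSize: 10}
	fmt.Println(p.ShouldFlush(9), p.ShouldFlush(10))
}
```
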
```go
// Good: Specific error types
type ValidationError struct {
	Field   string
	Message string
}

func (e *ValidationError) Error() string {
	return fmt.Sprintf("validation error in field %s: %s", e.Field, e.Message)
}

// Good: Contextual error wrapping
if err := validateRequest(req); err != nil {
	return fmt.Errorf("failed to validate request: %w", err)
}
```

Based on the refactored architecture design - version v1.3.0

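BatchFlow uses a flexible layered architecture that supports different kinds of data sources through a unified `BatchExecutor` interface:

- SQL databases: use `ThrottledBatchExecutor` + `BatchProcessor` + `SQLDriver`
- NoSQL databases: implement the `BatchExecutor` interface directly
- Message push / API calls: implement the `BatchExecutor` interface directly, which supports all kinds of custom batch tasks
- Test environments: use `MockExecutor`, which implements the interface directly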
- Implement the `SQLDriver` interface:

  ```go
  // drivers/newdb/driver.go
  type NewDBDriver struct{}

  func (d *NewDBDriver) GenerateInsertSQL(schema batchflow.SchemaInterface, data []map[string]any) (string, []any, error) {
  	// Generate the database-specific SQL statement.
  	// Handle the conflict strategies: ConflictIgnore, ConflictReplace, ConflictUpdate.
  	return sql, args, nil
  }
  ```

- Create the executor factory:

  ```go
  // drivers/newdb/executor.go
  func NewBatchExecutor(db *sql.DB) *batchflow.ThrottledBatchExecutor {
  	return batchflow.NewSQLThrottledBatchExecutorWithDriver(db, &NewDBDriver{})
  }

  func NewBatchExecutorWithDriver(db *sql.DB, driver batchflow.SQLDriver) *batchflow.ThrottledBatchExecutor {
  	return batchflow.NewSQLThrottledBatchExecutorWithDriver(db, driver)
  }
  ```

- Add the BatchFlow factory method:

  ```go
  // batchflow.go
  func NewNewDBBatchFlow(ctx context.Context, db *sql.DB, config PipelineConfig) *BatchFlow {
  	executor := newdb.NewBatchExecutor(db)
  	return NewBatchFlow(ctx, config.BufferSize, config.FlushSize, config.FlushInterval, executor)
  }
  ```

- Implement the `BatchExecutor` interface directly:

  ```go
  // drivers/newnosql/executor.go
  type Executor struct {
  	client *NewNoSQLClient
  }

  func (e *Executor) ExecuteBatch(ctx context.Context, schema batchflow.SchemaInterface, data []map[string]any) error {
  	// Implement the database-specific batch operation directly.
  	// There is no need to go through the BatchProcessor layer.
  	return nil
  }
  ```

- Create the factory method:

  ```go
  func NewBatchExecutor(client *NewNoSQLClient) *Executor {
  	return &Executor{client: client}
  }
  ```

- Add the BatchFlow factory method:

  ```go
  func NewNewNoSQLBatchFlow(ctx context.Context, client *NewNoSQLClient, config PipelineConfig) *BatchFlow {
  	executor := newnosql.NewBatchExecutor(client)
  	return NewBatchFlow(ctx, config.BufferSize, config.FlushSize, config.FlushInterval, executor)
  }
  ```

- Unit tests:

  ```go
  func TestNewDBDriver_GenerateInsertSQL(t *testing.T) {
  	driver := &NewDBDriver{}
  	schema := &batchflow.Schema{
  		Name:             "test_table",
  		Columns:          []string{"id", "name"},
  		ConflictStrategy: batchflow.ConflictIgnore,
  	}
  	data := []map[string]any{
  		{"id": 1, "name": "test"},
  	}

  	sql, args, err := driver.GenerateInsertSQL(schema, data)
  	assert.NoError(t, err)
  	assert.Contains(t, sql, "INSERT")
  	assert.Len(t, args, 2)
  }
  ```

- Integration tests:

  ```go
  func TestNewDBBatchFlow_Integration(t *testing.T) {
  	db := setupTestDB(t) // set up the test database
  	defer db.Close()

  	config := PipelineConfig{
  		BufferSize:    100,
  		FlushSize:     10,
  		FlushInterval: time.Second,
  	}
  	batch := NewNewDBBatchFlow(ctx, db, config)

  	// Test batch insertion
  	schema := NewSQLSchema("test_table", batchflow.ConflictIgnoreOperationConfig, "id", "name")
  	request := NewRequest(schema).SetInt64("id", 1).SetString("name", "test")

  	err := batch.Submit(ctx, request)
  	assert.NoError(t, err)

  	// Verify the inserted data
  	// ...
  }
  ```

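- Choose the appropriate implementation approach:
  - SQL databases: use the ThrottledBatchExecutor architecture to reuse the common logic
  - NoSQL databases: implement BatchExecutor directly to avoid unnecessary abstraction
- Performance optimization:
  - Use the database-specific bulk operation APIs
  - Avoid memory allocations in hot paths
  - Take advantage of the database's pipeline or batch features
- Error handling:
  - Provide clear error messages
  - Distinguish temporary errors from permanent errors
  - Support error retry mechanisms
- Metrics collection:
  - Implement the MetricsReporter interface
  - Record execution time, batch size, and success/failure status
  - Provide database-specific metrics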
- Use pointer receivers for methods
- Minimize memory allocations in hot paths
- Consider using sync.Pool for frequently allocated objects
- Profile code to identify bottlenecks
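
A minimal sketch of the `sync.Pool` suggestion, using only the standard library. `encodeRow` is a hypothetical hot-path helper and not part of BatchFlow; whether pooling actually helps should be confirmed with a profile or benchmark.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so hot paths avoid allocating one per call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeRow formats one row using a pooled buffer (hypothetical helper).
func encodeRow(id int64, name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from a previous use
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "%d:%s", id, name)
	return buf.String() // String copies the bytes, so the buffer can be reused safely
}

func main() {
	fmt.Println(encodeRow(1, "test")) // prints "1:test"
}
```
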
- Check existing issues first
- Use the bug report template
- Provide minimal reproduction case
- Include environment details
- Add relevant logs and error messages
- Use the feature request template
- Explain the use case and problem
- Propose a solution
- Consider backwards compatibility
- Discuss API design implications
- All tests pass
- Code coverage maintained or improved
- Documentation updated
- No linting errors
- Backwards compatibility preserved (unless breaking change is justified)
- Functionality: Does the code work as intended?
- Performance: Are there any performance regressions?
- Security: Are there any security implications?
- Maintainability: Is the code easy to understand and maintain?
- Testing: Are there adequate tests?
- Documentation: Is the documentation clear and complete?
- Code formatting (`go fmt`)
- Linting (`golangci-lint`)
- Unit tests with coverage
- Integration tests (MySQL, PostgreSQL, SQLite)
- Performance benchmarks
- Test with different Go versions
- Verify on different operating systems
- Test with various database versions
- Performance testing under load
- Performance Optimization: Improving throughput and reducing latency
- Error Handling: Better error messages and recovery mechanisms
- Documentation: Comprehensive guides and examples
- Testing: Increasing test coverage and reliability
- Additional database support (TiDB, ClickHouse)
- Monitoring and metrics integration
- Connection pool optimization
- Advanced batching strategies
- Be respectful and inclusive
- Provide constructive feedback
- Help newcomers get started
- Focus on technical merit
- Maintain professional communication
- Check existing documentation first
- Search closed issues for similar problems
- Ask questions in GitHub Discussions
- Provide context and examples when asking for help
- README.md - Project overview and basic usage
- CONFIG.md - Configuration options
- README-INTEGRATION-TESTS.md - Integration testing guide
- golangci-lint - Go linting
- Docker - Containerization
- Make - Build automation
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Security: Report security issues privately via email
Thank you for contributing to BatchFlow! Your efforts help make this project better for everyone. 🙏