Recommended Atomic Commit Structure
If you were to recreate this repository with atomic commits, here's the recommended structure:
1. Initial project setup
git commit -m "feat: Initialize algorithms library with core structure

- Set up TypeScript configuration
- Add package.json with dependencies
- Create source directory structure
- Configure build tools (tsup, vitest)"
2. Add distance algorithms
git commit -m "feat: Add Levenshtein distance algorithm

- Implement classic Levenshtein distance calculation
- Add similarity scoring method
- Include findClosest utility for candidate matching
- Add comprehensive test coverage"
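As a rough illustration of what this commit might contain, here is a minimal sketch of the classic dynamic-programming algorithm with a similarity score and a closest-match helper. The function names and signatures are assumptions made for this example, not the package's actual API.

```typescript
// Hypothetical sketch: names and signatures are illustrative only.

// Classic Levenshtein distance using a full (a.length + 1) x (b.length + 1) table.
function levenshtein(a: string, b: string): number {
  // dp[i][j] = edits needed to turn the first i chars of a into the first j chars of b
  const dp: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    dp.push(new Array<number>(b.length + 1).fill(0));
    dp[i][0] = i;
  }
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;

  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + cost, // substitution or match
      );
    }
  }
  return dp[a.length][b.length];
}

// Similarity in [0, 1]: 1 means identical, 0 means nothing in common.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Returns the candidate with the smallest distance (first one wins on ties).
function findClosest(word: string, candidates: string[]): string | undefined {
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const distance = levenshtein(word, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}
```

With this sketch, `levenshtein("kitten", "sitting")` evaluates to 3 and `findClosest("helo", ["world", "hello", "yellow"])` returns `"hello"`.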
git commit -m "feat: Add optimized Levenshtein implementation

- Space-optimized O(min(n,m)) memory usage
- Early termination with maxDistance parameter
- Batch calculation support
- Performance improvements for large datasets"
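The space and early-termination points can be sketched as follows. This is one possible approach under stated assumptions: the name `boundedLevenshtein` and the convention of returning `maxDistance + 1` when the limit is exceeded are choices made for this example, not confirmed behavior of the library.

```typescript
// Hypothetical sketch: two-row Levenshtein with an optional distance limit.
// Returns the exact distance when it is <= maxDistance, otherwise maxDistance + 1.
function boundedLevenshtein(a: string, b: string, maxDistance: number = Infinity): number {
  // Keep the shorter string as the row so memory is O(min(n, m)).
  if (a.length > b.length) [a, b] = [b, a];

  // The length difference is a lower bound on the distance.
  if (b.length - a.length > maxDistance) return maxDistance + 1;

  let prev: number[] = [];
  let curr: number[] = [];
  for (let i = 0; i <= a.length; i++) {
    prev.push(i);
    curr.push(0);
  }

  for (let j = 1; j <= b.length; j++) {
    curr[0] = j;
    let rowMin = curr[0];
    for (let i = 1; i <= a.length; i++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[i] = Math.min(prev[i] + 1, curr[i - 1] + 1, prev[i - 1] + cost);
      if (curr[i] < rowMin) rowMin = curr[i];
    }
    // The final distance can never be smaller than the minimum of any row,
    // so once a whole row exceeds the limit we can stop.
    if (rowMin > maxDistance) return maxDistance + 1;
    [prev, curr] = [curr, prev];
  }

  const distance = prev[a.length];
  return distance > maxDistance ? maxDistance + 1 : distance;
}
```

Called without a limit, `boundedLevenshtein("kitten", "sitting")` gives 3; with `maxDistance` set to 1 it gives 2, which only signals that the limit was exceeded.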
git commit -m "feat: Add Damerau-Levenshtein distance algorithm

- Support for transposition operations
- Both OSA and full Damerau-Levenshtein variants
- Edit operation tracking
- Optimized with early termination"
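To make the transposition rule concrete, here is a minimal sketch of the OSA (optimal string alignment) variant only. The function name is hypothetical, and the full Damerau-Levenshtein variant, operation tracking, and early termination are left out.

```typescript
// Hypothetical sketch of the OSA variant: Levenshtein plus adjacent transpositions,
// where no substring may be edited more than once.
function osaDistance(a: string, b: string): number {
  const d: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    d.push(new Array<number>(b.length + 1).fill(0));
    d[i][0] = i;
  }
  for (let j = 0; j <= b.length; j++) d[0][j] = j;

  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost, // substitution or match
      );
      // Adjacent transposition: the last two characters are swapped.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}
```

Here `osaDistance("ab", "ba")` is 1, while `osaDistance("ca", "abc")` is 3. The full Damerau-Levenshtein distance for that second pair is 2, which is the standard example of where the two variants differ.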
3. Add phonetic algorithms
git commit -m "feat: Add Soundex phonetic encoder

- Classic Soundex implementation
- soundsLike comparison method
- findSimilar utility for batch matching"
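For reference, a minimal sketch of classic American Soundex with a `soundsLike` comparison might look like the following. The standalone function shape is an assumption for this example; the package may expose a class or different signatures.

```typescript
// Hypothetical sketch of classic American Soundex (letter + three digits).
const SOUNDEX_CODES: Record<string, string> = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundex(word: string): string {
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  let result = letters[0].toUpperCase();
  let lastCode = SOUNDEX_CODES[letters[0]] ?? "";

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters[i];
    // h and w are skipped and do not separate letters with the same code.
    if (ch === "h" || ch === "w") continue;
    // Vowels (and y) have no code; they reset lastCode so a repeated code counts again.
    const code = SOUNDEX_CODES[ch] ?? "";
    if (code !== "" && code !== lastCode) result += code;
    lastCode = code;
  }
  // Pad with zeros to a fixed length of four characters.
  return (result + "000").slice(0, 4);
}

// Two words sound alike when both produce the same non-empty code.
function soundsLike(a: string, b: string): boolean {
  const codeA = soundex(a);
  return codeA !== "" && codeA === soundex(b);
}
```

Under these rules `soundex("Robert")` and `soundex("Rupert")` both produce `"R163"`, so `soundsLike("Robert", "Rupert")` is true.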
git commit -m "feat: Add Metaphone phonetic encoder

- Improved English pronunciation rules
- Configurable encoding length
- Better handling of silent letters"

git commit -m "feat: Add Double Metaphone encoder

- Support for multiple pronunciations
- Primary and alternate encodings
- Enhanced accuracy for names"
4. Add data structures
git commit -m "feat: Add Trie data structure

- Efficient prefix tree implementation
- Frequency tracking for suggestions
- Auto-complete functionality
- Case-insensitive operations"
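The features listed in this commit could be sketched roughly as below. The class shape, method names, and the choice to lowercase all keys for case-insensitivity are assumptions for illustration, not the library's confirmed design.

```typescript
// Hypothetical sketch of a case-insensitive trie with frequency-ranked completions.
interface TrieNode {
  children: Map<string, TrieNode>;
  frequency: number; // greater than 0 only where a stored word ends
}

class Trie {
  private readonly root: TrieNode = { children: new Map(), frequency: 0 };

  insert(word: string, frequency: number = 1): void {
    let node = this.root;
    for (const ch of word.toLowerCase()) {
      let next = node.children.get(ch);
      if (!next) {
        next = { children: new Map(), frequency: 0 };
        node.children.set(ch, next);
      }
      node = next;
    }
    node.frequency += frequency;
  }

  has(word: string): boolean {
    const node = this.find(word);
    return node !== undefined && node.frequency > 0;
  }

  // Returns up to `limit` stored words starting with `prefix`, most frequent first.
  autocomplete(prefix: string, limit: number = 5): string[] {
    const start = this.find(prefix);
    if (!start) return [];
    const matches: Array<{ word: string; frequency: number }> = [];
    const walk = (node: TrieNode, word: string): void => {
      if (node.frequency > 0) matches.push({ word, frequency: node.frequency });
      for (const [ch, child] of node.children) walk(child, word + ch);
    };
    walk(start, prefix.toLowerCase());
    matches.sort((x, y) => y.frequency - x.frequency || x.word.localeCompare(y.word));
    return matches.slice(0, limit).map((m) => m.word);
  }

  // Follows `key` from the root; undefined if the path does not exist.
  private find(key: string): TrieNode | undefined {
    let node: TrieNode | undefined = this.root;
    for (const ch of key.toLowerCase()) {
      node = node.children.get(ch);
      if (!node) return undefined;
    }
    return node;
  }
}
```

For example, after inserting `"hello"` with frequency 3, `"Helium"` with frequency 2, and `"help"` with frequency 1, `autocomplete("HE")` returns `["hello", "helium", "help"]`.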
5. Quality improvements
git commit -m "build: Add ESLint configuration

- TypeScript-aware linting rules
- Configure @typescript-eslint plugins
- Set up code quality standards
- Ignore test files and build outputs"

git commit -m "fix: Correct Metaphone TH encoding

- Change TH encoding from '0' to 'θ' (theta)
- Update corresponding tests
- Improve phonetic accuracy"
git commit -m "perf: Add cache size limit to LevenshteinDistance

- Prevent unbounded memory growth
- Add maxCacheSize parameter (default 10000)
- Implement FIFO cache eviction
- Consistent with other distance algorithms"
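One way to bound a cache with FIFO eviction is sketched below. The `FifoCache` class is hypothetical; only the `maxCacheSize` name and its default of 10000 come from the commit message above. The sketch relies on `Map` iterating keys in insertion order.

```typescript
// Hypothetical sketch of a size-limited cache with first-in, first-out eviction.
class FifoCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxCacheSize: number;

  constructor(maxCacheSize: number = 10000) {
    this.maxCacheSize = maxCacheSize;
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: V): void {
    if (this.maxCacheSize <= 0) return; // caching disabled
    if (!this.entries.has(key) && this.entries.size >= this.maxCacheSize) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    // Updating an existing key keeps its original position (FIFO, not LRU).
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A distance function could key such a cache on the input pair, for instance by joining the two strings with a separator that cannot occur in either input. FIFO is simpler than LRU because reads do not reorder entries, at the cost of sometimes evicting entries that are still in heavy use.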
git commit -m "fix: Add input validation for maxDistance parameter

- Validate non-negative maxDistance values
- Throw descriptive errors for invalid inputs
- Apply to all distance algorithms
- Improve API robustness"
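The validation described here might be as small as the following sketch. The helper name and the specific error types are assumptions; the commit message only states that invalid values are rejected with descriptive errors.

```typescript
// Hypothetical sketch: a shared guard each distance function could call first.
function assertValidMaxDistance(maxDistance: number): void {
  if (typeof maxDistance !== "number" || Number.isNaN(maxDistance)) {
    throw new TypeError(`maxDistance must be a number, received ${String(maxDistance)}`);
  }
  if (maxDistance < 0) {
    throw new RangeError(`maxDistance must be non-negative, received ${maxDistance}`);
  }
}
```

In this sketch `0` and `Infinity` are accepted (an exact-match-only limit and no limit, respectively), while `-1` and `NaN` throw.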
git commit -m "docs: Update README to remove non-existent features

- Remove Cologne Phonetic reference
- List only implemented features
- Keep documentation in sync with implementation"
6. Testing and build
git commit -m "test: Add comprehensive test suites

- 95.63% code coverage
- Unit tests for all algorithms
- Edge case handling
- Performance regression tests"

git commit -m "build: Configure TypeScript and build pipeline

- Strict TypeScript configuration
- ESM and CJS dual package support
- Type definitions generation
- Source maps for debugging"
Notes on Atomic Commits
Each commit should:
- Focus on a single concern
- Be independently testable
- Not break the build
- Include related tests and documentation
- Have a clear, descriptive message
The commits above represent logical units of work that could be reviewed, tested, and potentially reverted independently.