Add keywords CLI tool for text vectorization #122

@cardmagic

Description

Summary

Provide a separate command-line tool for keyword extraction and term analysis using TF-IDF.

Motivation

TF-IDF is a vectorizer, not a classifier: it transforms text into weighted term vectors rather than assigning labels. Cramming it into the classifier CLI would misrepresent what the tool does.

A dedicated keywords tool enables:

  • Keyword extraction from documents
  • Understanding term importance
  • Building vocabularies for other tools
  • Document similarity analysis
  • Preprocessing pipelines

Proposed CLI

Fit (build vocabulary)

# Build vocabulary from files
keywords fit corpus/*.txt

# From stdin
cat documents.txt | keywords fit

# Custom model path
keywords fit -m vocab.json corpus/*.txt
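Internally, `fit` only needs to persist per-term document frequencies and a document count. A minimal sketch of what the model file might contain, assuming a simple JSON layout (the method name, JSON keys, and tokenizer here are illustrative, not the gem's actual API):

```ruby
require "json"

# Hypothetical sketch of what `keywords fit` could persist: per-term
# document frequencies plus a total document count, serialized to JSON.
def fit(docs, model_path: "keywords.json", min_df: 1)
  df = Hash.new(0)
  docs.each do |doc|
    # Count each term at most once per document (document frequency).
    doc.downcase.scan(/[a-z']+/).uniq.each { |term| df[term] += 1 }
  end
  df.reject! { |_, count| count < min_df }
  model = { "documents" => docs.size, "df" => df }
  File.write(model_path, JSON.pretty_generate(model))
  model
end
```

Storing raw document frequencies (rather than precomputed IDF values) keeps the model incrementally updatable: refitting with more documents only bumps counts.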

Transform (extract terms)

# Get weighted terms from text
keywords "Ruby is a programming language"
# => ruby:0.52 programming:0.41 language:0.38

# From stdin
echo "some document text" | keywords
# => document:0.45 text:0.42

# Top N terms only
keywords -n 5 "long document with many terms..."
# => term1:0.5 term2:0.4 term3:0.3 term4:0.2 term5:0.1
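The weighting behind these numbers can be sketched in a few lines. This is an illustrative TF-IDF computation against a fitted model, not the gem's `Classifier::TFIDF` implementation (which does not exist yet); the method name, tokenizer, and smoothing choice are assumptions:

```ruby
# Weight a document's terms by tf-idf against a fitted model:
# df is a term => document-frequency hash, total_docs the corpus size.
def transform(text, df, total_docs, top: nil)
  terms = text.downcase.scan(/[a-z']+/)
  tf = terms.tally
  weights = tf.filter_map do |term, count|
    next unless df.key?(term)                               # skip out-of-vocabulary terms
    idf = Math.log((1.0 + total_docs) / (1.0 + df[term])) + 1.0  # smoothed idf
    [term, (count.to_f / terms.size) * idf]
  end
  weights.sort_by! { |_, w| -w }                            # highest weight first
  top ? weights.first(top) : weights
end
```

The `top:` keyword mirrors the proposed `-n` flag: rarer terms get a larger IDF boost, so they float to the front of the list.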

Extract (convenience alias)

# Extract keywords from a file
keywords extract article.txt
# => machine:0.61 learning:0.58 neural:0.45 network:0.42

# From URL (with curl)
curl -s https://example.com/article | keywords extract

Info

keywords info
# => Documents: 1,234
# => Vocabulary: 5,678
# => Min DF: 1
# => Max DF: 1.0

Options

-m, --model FILE    Model file (default: ./keywords.json)
-n, --top N         Show top N terms only
-q                  Quiet mode (for scripting)
-v, --version       Show version
-h, --help          Show help

Fit-specific options

--min-df N          Minimum document frequency (default: 1)
--max-df N          Maximum document frequency ratio (default: 1.0)
--ngram MIN,MAX     N-gram range (default: 1,1)
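Since the implementation notes call for stdlib `optparse`, the flag table above maps onto it directly. A sketch of the parser, using the flags from this proposal (the `parse_options` helper and defaults are assumptions):

```ruby
require "optparse"

# Parse the proposed CLI flags into an options hash using stdlib optparse.
def parse_options(argv)
  opts = { model: "./keywords.json", min_df: 1, max_df: 1.0, ngram: [1, 1] }
  OptionParser.new do |o|
    o.on("-m", "--model FILE")     { |v| opts[:model] = v }
    o.on("-n", "--top N", Integer) { |v| opts[:top] = v }
    o.on("-q")                     { opts[:quiet] = true }
    o.on("--min-df N", Integer)    { |v| opts[:min_df] = v }
    o.on("--max-df N", Float)      { |v| opts[:max_df] = v }
    o.on("--ngram MIN,MAX", Array) { |v| opts[:ngram] = v.map(&:to_i) }
  end.parse!(argv)
  opts
end
```

`parse!` consumes recognized flags and leaves positional arguments (files or raw text) in `argv`, which is exactly the split the subcommands need.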

Examples

# Build vocabulary and extract keywords
keywords fit articles/*.txt
keywords "What are the main topics?"
# => topics:0.6 main:0.4

# Fit, then extract the top 5 terms
keywords fit corpus.txt
keywords -n 5 extract article.txt

# Compare documents (output as TSV for scripting)
keywords -q doc1.txt > /tmp/v1.txt
keywords -q doc2.txt > /tmp/v2.txt
# Then use external tool for cosine similarity
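For the similarity step, the proposed `term:weight` output format is trivial to consume downstream. A sketch of the "external tool" side, assuming whitespace-separated `term:weight` pairs as shown in the examples above:

```ruby
# Parse a "term:weight term:weight ..." line into a term => weight hash.
def parse_vector(line)
  line.split.to_h { |pair| k, v = pair.split(":"); [k, v.to_f] }
end

# Cosine similarity between two sparse term-weight vectors.
def cosine(a, b)
  dot  = a.sum { |term, w| w * b.fetch(term, 0.0) }
  norm = ->(v) { Math.sqrt(v.values.sum { |w| w * w }) }
  dot / (norm.call(a) * norm.call(b))
end
```

Identical documents score 1.0; documents sharing no vocabulary score 0.0, which makes the output easy to threshold in scripts.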

Design Principles

  1. Separate tool: TF-IDF is not classification, so don't pretend it is
  2. Transform is default: No subcommand needed for primary action
  3. Stdin works: Pipe-friendly
  4. Scriptable output: -q for machine-readable format

Implementation Notes

  • Use optparse (stdlib)
  • Reuse existing Classifier::TFIDF class (when implemented)
  • Exit codes: 0 success, 1 error, 2 usage error
  • Default model: ./keywords.json
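The exit-code convention from the notes above can be wired up like this. The `run` structure is a sketch, not the actual entry point; only the codes (0 success, 1 error, 2 usage error) come from this proposal:

```ruby
require "optparse"

# Sketch of the proposed exit-code convention:
# 0 = success, 1 = runtime error, 2 = usage error.
def run(argv)
  OptionParser.new { |o| o.on("-m FILE") }.parse!(argv)
  return 2 if argv.empty?          # usage error: no input given
  # ... load model, extract keywords from argv ...
  0
rescue OptionParser::ParseError
  2                                # bad flag or missing flag argument
rescue StandardError
  1                                # runtime failure (e.g. unreadable model)
end
```

Distinguishing usage errors (2) from runtime errors (1) lets shell pipelines tell "you called me wrong" apart from "something broke", which matters for the scripting use cases above.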
