1 change: 1 addition & 0 deletions HISTORY.md
@@ -5,6 +5,7 @@
- enhance the `dump` command with a worker pool for concurrency and thread-safe data writing mechanics.
- include comprehensive unit test coverage for the validation and concurrency logic in fields_test.go and dump_test.go.
- update README.md and README_ZH.md to natively reference the official FOFA API documentation URLs for valid fields.
- optimize `search` command by adding `-bs` parameter to customize batch size for returned data.

## v0.2.28 fix dedup mode

7 changes: 5 additions & 2 deletions README.md
@@ -145,6 +145,7 @@ categories:
| urlPrefix | | http:// | URL prefix |
| full | | false | Retrieves full data |
| uniqByIP | | false | Removes duplicates based on IP |
| batchSize | bs | 1000 | Pagination size per fetch *3 |
| workers | | 10 | Number of threads |
| rate | | 2 | Query rate per second |
| template | | ip={} | Replaces `{}` with content from pipeline input |
@@ -158,27 +159,29 @@ categories:

*1: When the query contains `cert` and `banner`, the maximum results size setting is 2000 per page.
*2: When the query contains `body`, the maximum results size setting is 500 per page.
*3: When the `body` field is included, the default `batchSize` is automatically capped at 500. If the `-bs` parameter is explicitly set, the set value will be used instead.

### `dump`

| Parameter | Abbreviation | Default Value | Description |
|-------------|--------------|---------------|-----------------------------------------------------------|
| fields | f | ip,port | Fields returned by FOFA, valid fields refer to https://en.fofa.info/api, for dump command refer to https://en.fofa.info/api/batches_pages |
| fields | f | ip,port | Fields returned by FOFA, valid fields refer to https://en.fofa.info/api/batches_pages |
| format | | csv | Output format: csv/json/xml |
| outFile | o | | Output file. If not set, prints to terminal |
| inFile | i | | Input file. If not set, reads from pipeline input |
| size | s | 100 | Query size. No upper limit but consumes f-points or free quota *1*2 |
| fixUrl | | false | Combines URLs (e.g., 1.1.1.1,80 becomes http://1.1.1.1) |
| urlPrefix | | http:// | URL prefix |
| full | | false | Retrieves full data |
| batchSize | bs | 1000 | Number of records to fetch per batch |
| batchSize | bs | 1000 | Number of records to fetch per batch *3 |
| batchType | bt | | Batch query type: ip/domain |
| workers | | 10 | Number of threads, defaults to 10 when using -i |
| rate | | 2 | Query rate per second |
| help | h | false | Displays usage information |

*1: When the query contains `cert` and `banner`, the maximum results size setting is 2000 per page.
*2: When the query contains `body`, the maximum results size setting is 500 per page.
*3: When the `body` field is included, the default `batchSize` is automatically capped at 500. If the `-bs` parameter is explicitly set, the set value is used instead.
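
For `dump`, the `DumpSearch` change in `host.go` from this same diff treats `0` as "not set", validates the range, and lowers only the default when `body` is requested. A minimal sketch of that behavior follows; `resolveDumpBatchSize` is a hypothetical helper name used for illustration, not a function in the library.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveDumpBatchSize is a hypothetical helper (not part of the library)
// that mirrors the batch-size handling the diff adds to DumpSearch.
func resolveDumpBatchSize(batchSize int, fields []string) (int, error) {
	perPage := batchSize
	userSet := true
	if perPage == 0 { // flag default is 0, meaning "not set"
		perPage = 1000
		userSet = false
	}

	if perPage < 1 || perPage > 100000 {
		return 0, errors.New("batchSize must be between 1 and 100000")
	}

	// Only the default is lowered for body; an explicit value is kept.
	if !userSet {
		for _, f := range fields {
			if f == "body" {
				perPage = 500
				break
			}
		}
	}
	return perPage, nil
}

func main() {
	fields := []string{"ip", "port", "body"}
	for _, bs := range []int{0, 800, -5} {
		n, err := resolveDumpBatchSize(bs, fields)
		fmt.Println(bs, n, err)
	}
}
```

An unset value resolves to 500 here, 800 is kept as given, and a negative value is rejected with an error.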

### `jsRender`

7 changes: 5 additions & 2 deletions README_ZH.md
@@ -148,6 +148,7 @@ categories:
| urlPrefix | | http:// | url前缀 |
| full | | false | 是否调取全量数据 |
| uniqByIP | | false | 是否根据ip去重 |
| batchSize | bs | 1000 | 每次拉取的分页大小 *3 |
| workers | | 10 | 线程数量 |
| rate | | 2 | 每秒查询次数 |
| template | | ip={} | 从管道获取输入,输入的内容会替换{} |
@@ -161,27 +162,29 @@ categories:

*1:当获取字段包含 `cert` 和 `banner` 时,单次查询 size 最大支持 2000。
*2:当获取字段包含 `body` 时,单次查询 size 最大支持 500。
*3:当获取字段包含 `body` 时,默认的 `batchSize` 会自动限制为 500。如果手动设置了 `-bs` 参数,则以设置的值为准。

### dump

| 参数 | 参数简写 | 默认值 | 简介 |
| --------- | -------- | ------- | ----------------------------------------------------- |
| fields | f | ip,port | FOFA返回的字段选择,有效字段参考https://fofa.info/api,dump的参考https://fofa.info/api/batches_pages |
| fields | f | ip,port | FOFA返回的字段选择,有效字段参考https://fofa.info/api/batches_pages |
| format | | csv | 输出格式,可以为csv/json/xml |
| outFile | o | | 输出文件,如果不设置则终端打印 |
| inFile | i | | 输入文件,如果不设置则读取管道输入 |
| size | s | 100 | 查询数量,无上限,但要扣除f点或免费数量 *1*2 |
| fixUrl | | false | 是否组合url,例如1.1.1.1,80组合为http://1.1.1.1 |
| urlPrefix | | http:// | url前缀 |
| full | | false | 是否调取全量数据 |
| batchSize | bs | 1000 | 每次拉取多少条数据 |
| batchSize | bs | 1000 | 每次拉取多少条数据 *3 |
| batchType | bt | | 批量查询,可以为ip/domain |
| workers | | 10 | 线程数量,当使用-i时默认10 |
| rate | | 2 | 每秒查询次数 |
| help | h | false | 使用方法 |

*1:当获取字段包含 `cert` 和 `banner` 时,单次查询 size 最大支持 2000。
*2:当获取字段包含 `body` 时,单次查询 size 最大支持 500。
*3:当获取字段包含 `body` 时,每次拉取的 `batchSize` 会自动限制为 500。

### jsRender

9 changes: 7 additions & 2 deletions USER_GUIDE.md
@@ -635,12 +635,13 @@ $ fofa --version
| fields | f | ip,port | FOFA fields to retrieve. [Learn More](https://en.fofa.info/vip) |
| format | | csv | Output format: csv/json/xml |
| outFile | o | | Output file. If not set, prints to terminal |
| size | s | 100 | Query size. Maximum is 10,000, limited by `deductMode` |
| size | s | 100 | Query size. Maximum is 10,000, limited by `deductMode` |
| deductMode | | | Determines consumption of f-points. Uses free quota by default |
| fixUrl | | false | Concatenates URLs (e.g., `1.1.1.1,80` → `http://1.1.1.1`) |
| urlPrefix | | http:// | URL prefix |
| full | | false | Retrieves full data |
| uniqByIP | | false | Removes duplicates by IP |
| batchSize | bs | 1000 | Pagination size per fetch *1 |
| workers | | 10 | Number of threads |
| rate | | 2 | Queries per second |
| template | | ip={} | Replaces `{}` with content from pipeline input |
@@ -652,6 +653,8 @@ $ fofa --version
| headline | | false | Outputs CSV headers (only applicable for CSV format) |
| customFields | cf | | use custom fields |
| help | h | false | Displays usage instructions |

*1: When the `body` field is included, the default `batchSize` is automatically capped at 500. If the `-bs` parameter is explicitly set, the set value will be used instead.

### Dump

@@ -665,11 +668,13 @@ $ fofa --version
| fixUrl | | false | Concatenates URLs (e.g., `1.1.1.1,80` → `http://1.1.1.1`) |
| urlPrefix | | http:// | URL prefix |
| full | | false | Retrieves full data |
| batchSize | bs | 1000 | Number of records fetched per batch |
| batchSize | bs | 1000 | Number of records fetched per batch *1 |
| batchType | bt | | Batch query type: ip/domain |
| customFields | cf | | use custom fields |
| help | h | false | Displays usage instructions |

*1: When the `body` field is included, the default `batchSize` is automatically capped at 500. If the `-bs` parameter is explicitly set, the set value is used instead.

### jsRender

| Parameter | Abbreviation | Default Value | Description |
7 changes: 6 additions & 1 deletion USER_GUIDE_ZH.md
@@ -637,6 +637,7 @@ $ fofa --version
| urlPrefix | | http:// | url前缀 |
| full | | false | 是否调取全量数据 |
| uniqByIP | | false | 是否根据ip去重 |
| batchSize | bs | 1000 | 每次拉取的分页大小 *1 |
| workers | | 10 | 线程数量 |
| rate | | 2 | 每秒查询次数 |
| template | | ip={} | 从管道获取输入,输入的内容会替换{} |
@@ -649,6 +650,8 @@ $ fofa --version
| customFields | cf | | 使用自定义fields字段 |
| help | h | false | 使用方法 |

*1:当获取字段包含 `body` 时,默认的 `batchSize` 会自动限制为 500。如果手动设置了 `-bs` 参数,则以设置的值为准。

### dump

| 参数 | 参数简写 | 默认值 | 简介 |
@@ -661,11 +664,13 @@ $ fofa --version
| fixUrl | | false | 是否组合url,例如1.1.1.1,80组合为http://1.1.1.1 |
| urlPrefix | | http:// | url前缀 |
| full | | false | 是否调取全量数据 |
| batchSize | bs | 1000 | 每次拉取多少条数据 |
| batchSize | bs | 1000 | 每次拉取多少条数据 *1 |
| batchType | bt | | 批量查询,可以为ip/domain |
| customFields | cf | | 使用自定义fields字段 |
| help | h | false | 使用方法 |

*1:当获取字段包含 `body` 时,每次拉取的 `batchSize` 会自动限制为 500。

### jsRender

| 参数 | 参数简写 | 默认值 | 简介 |
22 changes: 22 additions & 0 deletions client_test.go
@@ -149,6 +149,26 @@ var (
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["1.1.1.1:81","1.1.1.1","81"],["1.1.1.1:82","1.1.1.1","82"],["1.1.1.1:83","1.1.1.1","83"],["1.1.1.1:84","1.1.1.1","84"],["1.1.1.1:85","1.1.1.1","85"],["1.1.1.1:86","1.1.1.1","86"],["1.1.1.1:87","1.1.1.1","87"],["1.1.1.1:88","1.1.1.1","88"],["1.1.1.1:89","1.1.1.1","89"],["1.1.1.1:90","1.1.1.1","90"]]}`))
}

return
case "ip,port,body":
// Test for body field with batchSize auto-cap
switch r.FormValue("size") {
case "10":
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["94.130.128.248","80","<html>test body 1</html>"],["186.6.19.151","80","<html>test body 2</html>"],["72.247.70.195","80","<html>test body 3</html>"],["18.66.199.67","80","<html>test body 4</html>"],["91.122.52.148","80","<html>test body 5</html>"],["113.23.57.252","80","<html>test body 6</html>"],["54.144.154.222","80","<html>test body 7</html>"],["188.223.2.247","80","<html>test body 8</html>"],["50.213.108.254","80","<html>test body 9</html>"],["34.237.16.144","80","<html>test body 10</html>"]]}`))
case "100":
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["94.130.128.248","80","<html>body</html>"],["186.6.19.151","80","<html>body</html>"],["72.247.70.195","80","<html>body</html>"],["18.66.199.67","80","<html>body</html>"],["91.122.52.148","80","<html>body</html>"],["113.23.57.252","80","<html>body</html>"],["54.144.154.222","80","<html>body</html>"],["188.223.2.247","80","<html>body</html>"],["50.213.108.254","80","<html>body</html>"],["34.237.16.144","80","<html>body</html>"]]}`))
default:
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["94.130.128.248","80","<html>body</html>"]]}`))
}
return
case "ip,port,host":
// Test for non-body fields
switch r.FormValue("size") {
case "10":
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["94.130.128.248","80","94.130.128.248:80"],["186.6.19.151","80","186.6.19.151:80"],["72.247.70.195","80","72.247.70.195:80"],["18.66.199.67","80","18.66.199.67:80"],["91.122.52.148","80","91.122.52.148:80"],["113.23.57.252","80","113.23.57.252:80"],["54.144.154.222","80","54.144.154.222:80"],["188.223.2.247","80","188.223.2.247:80"],["50.213.108.254","80","50.213.108.254:80"],["34.237.16.144","80","34.237.16.144:80"]]}`))
default:
w.Write([]byte(`{"error":false,"size":470293950,"page":1,"mode":"extended","query":"port=\"80\"","results":[["94.130.128.248","80","94.130.128.248:80"]]}`))
}
return
}
case "port=5354":
@@ -255,6 +275,8 @@ var (
case "host,ip,port,protocol":
data = append([]string{fmt.Sprintf("http://%d.%d.%d.%d", i, i, i, i)}, data...)
data = append(data, "http")
case "ip,port,body":
data = append(data, fmt.Sprintf("<html>body content %d</html>", i+j))
}

results = append(results, data)
4 changes: 2 additions & 2 deletions cmd/fofa/cmd/dump.go
@@ -92,8 +92,8 @@ var dumpCmd = &cli.Command{
&cli.IntFlag{
Name: "batchSize",
Aliases: []string{"bs"},
Value: 1000,
Usage: "the amount of data contained in each batch",
Value: 0,
Usage: "the amount of data contained in each batch, default 1000",
Destination: &batchSize,
},
&cli.StringFlag{
8 changes: 8 additions & 0 deletions cmd/fofa/cmd/search.go
@@ -98,6 +98,13 @@ var searchCmd = &cli.Command{
Usage: "search result for over a year",
Destination: &full,
},
&cli.IntFlag{
Name: "batchSize",
Aliases: []string{"bs"},
Value: 0,
Usage: "amount of data contained in each page batch, default 1000",
Destination: &batchSize,
},
&cli.BoolFlag{
Name: "uniqByIP",
Value: false,
@@ -351,6 +358,7 @@ func SearchAction(ctx *cli.Context) error {
DeWildcard: deWildcard,
Filter: filter,
DedupHost: dedupHost,
BatchSize: batchSize,
})
if err != nil {
return err
55 changes: 50 additions & 5 deletions host.go
@@ -5,11 +5,13 @@ import (
"encoding/base64"
"errors"
"fmt"
"github.com/Knetic/govaluate"
"github.com/expr-lang/expr"
"log"
"math"
"strconv"
"strings"

"github.com/Knetic/govaluate"
"github.com/expr-lang/expr"
)

const (
@@ -55,6 +57,7 @@ type SearchOptions struct {
DeWildcard int // number of wildcard domains retained
Filter string // filter data by rules
DedupHost bool // prioritize subdomain data retention
BatchSize int // custom batch size
}

// fixHostToUrl replaces host with url
@@ -226,11 +229,34 @@ func (c *Client) HostSearch(query string, size int, fields []string, options ...
}

page := 1
perPage := int(math.Min(float64(size), 1000)) // fetch at most 1000 per request

// fetch all data at once; perPage defaults to 1000
maxPerPage := 1000
userSetBatchSize := false
if len(options) > 0 && options[0].BatchSize > 0 {
maxPerPage = options[0].BatchSize
userSetBatchSize = true
if maxPerPage > 10000 {
maxPerPage = 10000 // /search/all api limit
}
}

for _, f := range fields {
if f == "body" {
if maxPerPage > 500 && !userSetBatchSize {
maxPerPage = 500
if c.logger != nil {
c.logger.Warnf("fields contains body, change batchSize to %d", maxPerPage)
}
}
break
}
}

perPage := int(math.Min(float64(size), float64(maxPerPage))) // fetch at most maxPerPage per request

// fetch all data at once; perPage defaults to maxPerPage
if size == -1 {
perPage = 1000
perPage = maxPerPage
}

hostIndex, protocolIndex, fields, rawFieldSize, err := c.fixUrlCheck(fields, options...)
@@ -397,6 +423,7 @@ func (c *Client) HostSearch(query string, size int, fields []string, options ...
}

res = append(res, results...)
log.Printf("size: %d for query: %s", len(res), query)

// requested size reached, done
if size != -1 && size <= len(res) {
@@ -489,10 +516,28 @@ func (c *Client) DumpSearch(query string, allSize int, batchSize int, fields []s

next := ""
perPage := batchSize
userSetBatchSize := true
if perPage == 0 {
perPage = 1000
userSetBatchSize = false
}

if perPage < 1 || perPage > 100000 {
return errors.New("batchSize must be between 1 and 100000")
}

for _, f := range fields {
if f == "body" {
if perPage > 500 && !userSetBatchSize {
perPage = 500
if c.logger != nil {
c.logger.Warnf("fields contains body, change batchSize to %d", perPage)
}
}
break
}
}

// make sure the protocol field is included when fixUrl is enabled
hostIndex, protocolIndex, fields, rawFieldSize, err := c.fixUrlCheck(fields, options...)
if err != nil {