author | heqnx <root@heqnx.com> | 2025-05-23 13:59:42 +0300 |
---|---|---|
committer | heqnx <root@heqnx.com> | 2025-05-23 13:59:42 +0300 |
commit | e066414731c04564d7bf1c3bf756e229f6424d55 (patch) | |
tree | 2369e3025432f659a20fc7075c5009ee500a6aad /README.md | |
parent | cf62223b636bdfb1b0c1de5f54cca69b302c0031 (diff) | |
added go-linkfinder
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 114 |
1 files changed, 114 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..4842136
--- /dev/null
+++ b/README.md
@@ -0,0 +1,114 @@
+# go-linkfinder
+
+`go-linkfinder` is a simple web crawling and link extraction tool written in Go, inspired by [https://github.com/GerbenJavado/LinkFinder](https://github.com/GerbenJavado/LinkFinder). It supports fetching content from URLs or files, recursively crawling same-domain links up to a specified depth, and customizing request headers, user-agent, timeout, and delay between requests.
+
+> **WARNING**: This tool is intended for **authorized security testing and web research only**. Unauthorized crawling may violate website terms of service or legal regulations. The author and contributors are not responsible for misuse. Always obtain explicit permission before crawling any site.
+
+## Features
+
+- **Input Sources**: Fetch content from URLs or local files.
+- **Recursive Crawling**: Configurable recursion depth to crawl same-domain links.
+- **Request Customization**: Set HTTP headers and user-agent strings.
+- **Timeout Control**: Customize HTTP request timeout.
+- **Delay Between Requests**: Configurable delay (default 4 seconds) to avoid overloading servers.
+- **Automatic Gzip Handling**: Supports gzip compressed HTTP responses.
+- **URL Normalization**: Resolves relative URLs to absolute form.
+- **Duplicate Detection**: Tracks and avoids revisiting URLs.
+
+## Installation
+
+### Prerequisites
+
+- **Go**: Version 1.21 or later.
+- **Make**: For building with the provided Makefile.
+- **Git**: To clone the repository.
+
+### Steps
+
+- Clone the repository:
+
+```
+$ git clone https://cgit.heqnx.com/go-linkfinder
+$ cd go-linkfinder
+```
+
+- Install dependencies:
+
+```
+$ go mod tidy
+```
+
+- Build for all platforms:
+
+```
+$ make all
+```
+
+- Binaries will be generated in the build/ directory for Linux, Windows, and macOS; alternatively, build for a specific platform:
+
+```
+$ make linux-amd64
+$ make windows-amd64
+$ make darwin-arm64
+```
+
+- (Optional) Run directly with Go:
+
+```
+$ go run main.go -depth <depth> -delay <delay> -header <Header: Value> -input <url/file> -timeout <timeout> -user-agent <ua>
+```
+
+## Usage
+
+### Command-Line Flags
+
+```
+Usage of ./go-linkfinder-linux-amd64:
+  -depth int
+        recursion depth for same-domain links (0 disables crawling) (default 0)
+  -delay int
+        delay between requests in seconds when crawling (only applies if depth > 0) (default 4)
+  -header value
+        add HTTP header to request (can be repeated, e.g. -header "Authorization: Bearer token")
+  -input string
+        url or file path (required)
+  -timeout int
+        timeout for HTTP requests in seconds (default 10)
+  -user-agent string
+        set User-Agent header (default "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.10 Safari/605.1.1")
+```
+
+## Examples
+
+### Crawl a single URL without recursion
+
+```
+$ ./go-linkfinder -input https://example.com -depth 0
+```
+
+- Extracts and prints all links found on the initial page only.
+
+### Recursively crawl same-domain links with delay
+
+```
+$ ./go-linkfinder -input https://example.com -depth 2 -delay 5
+```
+
+- Crawls links up to 2 levels deep within the same domain.
+
+- Waits 5 seconds between each HTTP request to avoid server overload.
+
+### Use Custom Headers and User-Agent
+
+```
+$ ./go-linkfinder -input https://example.com -depth 1 -header "Authorization: Bearer mytoken" -user-agent "CustomAgent/1.0"
+```
+
+## License
+
+This project is licensed under the MIT License. See the LICENSE file for details.
+
+## Disclaimer
+
+`go-linkfinder` is provided "as is" without warranty. The author and contributors are not liable for any damages or legal consequences arising from its use. Use responsibly and only in authorized environments.
+
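
The README added in this commit lists URL normalization, same-domain crawling, and duplicate detection as features, but the diff shown here contains no Go source. As a rough illustration only, the sketch below shows one way those three steps could fit together using the standard `net/url` package. The names `resolveSameHost` and `collect` are hypothetical, and treating "same-domain" as an exact host match (with fragments dropped and only `http`/`https` kept) is an assumption, not a description of how go-linkfinder actually behaves.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveSameHost is a hypothetical helper (not taken from go-linkfinder).
// It resolves href against base and reports whether the resulting absolute
// URL is an http(s) URL on exactly the same host as base.
func resolveSameHost(base *url.URL, href string) (string, bool) {
	ref, err := url.Parse(href)
	if err != nil {
		return "", false
	}
	abs := base.ResolveReference(ref)
	// Assumption: a fragment does not identify a separate page to crawl.
	abs.Fragment = ""
	abs.RawFragment = ""
	if abs.Scheme != "http" && abs.Scheme != "https" {
		return "", false
	}
	return abs.String(), abs.Host == base.Host
}

// collect is a hypothetical helper that returns the same-host links of a
// page that have not been seen before, recording them in visited.
func collect(pageURL string, hrefs []string, visited map[string]bool) ([]string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, h := range hrefs {
		abs, same := resolveSameHost(base, h)
		if !same || visited[abs] {
			continue // off-host, unsupported scheme, unparsable, or duplicate
		}
		visited[abs] = true
		out = append(out, abs)
	}
	return out, nil
}

func main() {
	visited := map[string]bool{}
	hrefs := []string{
		"../about",                // relative link, resolved against the page URL
		"/about#team",             // same page as above once the fragment is dropped
		"https://other.example/x", // different host, skipped
		"page.html",               // relative to the /docs/ directory
		"mailto:a@example.com",    // non-http scheme, skipped
	}
	links, err := collect("https://example.com/docs/", hrefs, visited)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

A crawler built this way would feed each returned link back into the fetch step until the `-depth` limit is reached, sharing the one `visited` map across all levels so no URL is requested twice.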