Commit 6c364f1

feat: add copilot instructions for AI coding agents in webinfo module
1 parent 3d75a15 commit 6c364f1

1 file changed

+62
-0
lines changed

.github/copilot-instructions.md

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
1+
Purpose
2+
-------
3+
This file gives concise, actionable guidance for AI coding agents working on the `webinfo` Go module.
4+
5+
**What this project does**: Extracts metadata (title, description, canonical, image, etc.) from web pages and provides utilities to fetch and save representative images.
6+
7+
Quick entry points
8+
------------------
9+
- **Primary package**: `webinfo` — key files: `fetch.go` (core `Fetch` function), `webinfo.go` (`Webinfo` struct and `DownloadImage`), `errs.go` (error sentinel values), `fetch_test.go` (behavioral tests).
10+
- **Go module**: `go 1.25` (see `go.mod`).
11+
12+
Developer workflows
13+
-------------------
14+
- Run full CI/test workflow using the Taskfile (recommended if `task` is installed):
15+
- `task test` — runs `go mod verify`, `go test -shuffle on ./...`, `govulncheck`, and `golangci-lint-v2` as configured in `Taskfile.yml`.
16+
- Quick test: `go test ./...` (useful during fast iteration).
17+
- Prepare module: `go mod tidy -v -go=1.25` (mirrors `prepare` in `Taskfile.yml`).
18+
19+
Project-specific conventions and patterns
20+
----------------------------------------
21+
- Error handling: uses `github.com/goark/errs`. Prefer `errs.Wrap(err, errs.WithContext("key", val))` for context-rich errors and `errs.Join` when combining close errors in `defer`.
22+
- HTTP fetching: uses `github.com/goark/fetch`. Typical pattern:
23+
- Parse URL with `fetch.URL(...)`.
24+
- Use `fetch.New(...).GetWithContext(ctx, parsed, fetch.WithRequestHeaderSet("User-Agent", ua))`.
25+
- Default User-Agent: `getUserAgent("")` returns a dummy UA string. Functions accept a `userAgent` param but fall back to this default.
26+
- Encoding: `Fetch` peeks the first 1024 bytes and uses `charset.DetermineEncoding` and `encoding.GetEncoding(name)` to decode response bodies before HTML parsing — preserve this approach when touching parsing logic.
27+
- HTML parsing: `goquery` is used to select head elements and meta tags. Extraction precedence is explicit in `fetch.go` (title → `twitter:title`/`og:title`, description → `twitter:description`/`og:description`, image → `twitter:image`/`og:image`). Follow this precedence in code changes or tests.
28+
- Image download (`DownloadImage` in `webinfo.go`):
29+
- Determines extension from URL path, `Content-Type` header, sniffing (up to 512 bytes), then fallback to `.img`.
30+
- If URL has no filename, `temporary` is forced true and `os.CreateTemp(destDir, "webinfo-image-*"+ext)` is used.
31+
- When sniffing bytes, the code prepends the read bytes back into the stream with `io.MultiReader` so the full image is written.
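The extension-detection order and the `io.MultiReader` trick described for `DownloadImage` can be sketched with the standard library alone. This is a minimal illustration, not the module's code: `detectExt`, `extFromContentType`, `detectFromBytes`, and the small media-type table are hypothetical names and assumptions made for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"path"
	"strings"
)

// extFromContentType maps a few common image media types to extensions.
// The table is illustrative only, not the module's actual mapping.
func extFromContentType(ct string) string {
	// Drop parameters such as "; charset=utf-8".
	if i := strings.IndexByte(ct, ';'); i >= 0 {
		ct = ct[:i]
	}
	switch strings.ToLower(strings.TrimSpace(ct)) {
	case "image/png":
		return ".png"
	case "image/jpeg":
		return ".jpg"
	case "image/gif":
		return ".gif"
	case "image/webp":
		return ".webp"
	}
	return ""
}

// detectExt (hypothetical) tries the URL path, then the Content-Type header,
// then sniffs up to 512 bytes, and finally falls back to ".img".
// The returned reader always yields the complete body.
func detectExt(urlPath, contentType string, body io.Reader) (string, io.Reader, error) {
	if ext := path.Ext(urlPath); ext != "" {
		return ext, body, nil
	}
	if ext := extFromContentType(contentType); ext != "" {
		return ext, body, nil
	}
	head := make([]byte, 512)
	n, err := io.ReadFull(body, head)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return "", nil, err
	}
	head = head[:n]
	ext := extFromContentType(http.DetectContentType(head))
	if ext == "" {
		ext = ".img"
	}
	// Put the sniffed bytes back in front of the unread remainder.
	return ext, io.MultiReader(bytes.NewReader(head), body), nil
}

// detectFromBytes is a small helper that runs detectExt over an in-memory
// body and drains the returned reader.
func detectFromBytes(urlPath, contentType string, data []byte) (string, []byte, error) {
	ext, r, err := detectExt(urlPath, contentType, bytes.NewReader(data))
	if err != nil {
		return "", nil, err
	}
	out, err := io.ReadAll(r)
	return ext, out, err
}

func main() {
	ext, out, err := detectFromBytes("/avatar", "", []byte("\x89PNG\r\n\x1a\nrest-of-image"))
	fmt.Println(ext, len(out), err)
}
```

Because the sniffed prefix is replayed before the rest of the stream, the caller can copy the returned reader straight to the destination file without losing the first bytes.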
32+
33+
Tests and examples
34+
------------------
35+
- Tests use `net/http/httptest` for deterministic responses (encoding tests use `golang.org/x/text/encoding/japanese`). Inspect `fetch_test.go` for examples of:
36+
- Redirect handling and validation of `Location`.
37+
- Encoding tests for Shift_JIS and ISO-2022-JP.
38+
- Verifying `User-Agent` header usage.
39+
- Example usage patterns to follow when adding code or tests:
40+
- Fetch: `info, err := Fetch(ctx, "https://example.com", "")` — empty UA uses the default.
41+
- Download image: `outPath, err := w.DownloadImage(ctx, "images", true)`
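As a rough template for the `httptest` style listed above, the sketch below starts a local server that echoes the `User-Agent` header and checks what the client sent. It deliberately uses a plain `net/http` client as a stand-in for `Fetch`, so `fetchWithUA` and `echoUA` are hypothetical helpers, not part of the `webinfo` API.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchWithUA is a hypothetical stand-in for webinfo.Fetch: it issues a GET
// with the given User-Agent and returns the response body as a string.
func fetchWithUA(ctx context.Context, url, userAgent string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("User-Agent", userAgent)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// echoUA spins up a deterministic test server that writes back the
// User-Agent it received, then returns what the server saw.
func echoUA(userAgent string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("User-Agent"))
	}))
	defer srv.Close()
	return fetchWithUA(context.Background(), srv.URL, userAgent)
}

func main() {
	got, err := echoUA("webinfo-test/1.0")
	fmt.Println(got, err)
}
```

In a real test the same server setup would live inside a `Test...` function, with the handler asserting on the request and the call to `Fetch` replacing `fetchWithUA`.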
42+
43+
External dependencies & integration points
44+
----------------------------------------
45+
- Key dependencies in `go.mod`: `github.com/goark/fetch`, `github.com/goark/errs`, `github.com/PuerkitoBio/goquery`, `golang.org/x/text` (encodings).
46+
- The `Taskfile.yml` runs additional tools: `govulncheck`, `golangci-lint-v2`, and (optionally) `nancy` via `depm` — keep CI tool invocations in sync when adding dependencies.
47+
48+
When modifying public APIs
49+
-------------------------
50+
- Maintain existing error-wrapping conventions (`errs.Wrap`, `errs.WithContext`).
51+
- Preserve encoding detection behavior and the 1024-byte peek in `Fetch` unless a clear, tested performance reason exists.
52+
- Preserve `DownloadImage`'s extension-detection order and the behavior of `temporary` vs permanent files.
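The error-wrapping convention in the first bullet (add context when wrapping, and join close errors in a `defer`) is sketched below using only the standard library. Here `fmt.Errorf` with `%w` and `errors.Join` are swapped in for `errs.Wrap` and `errs.Join`, so the shape carries over but the calls are not the `goark/errs` API; `resource`, `process`, and `check` are hypothetical names for the example.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errRead  = errors.New("read failed")
	errClose = errors.New("close failed")
)

// resource is a hypothetical closable value whose failures can be toggled.
type resource struct {
	failRead  bool
	failClose bool
}

func (r *resource) Read() error {
	if r.failRead {
		return errRead
	}
	return nil
}

func (r *resource) Close() error {
	if r.failClose {
		return errClose
	}
	return nil
}

// process wraps the primary error with context and, in the deferred
// function, joins any close error onto the named return value so that
// neither failure is lost.
func process(r *resource, url string) (err error) {
	defer func() {
		if cerr := r.Close(); cerr != nil {
			err = errors.Join(err, fmt.Errorf("close %q: %w", url, cerr))
		}
	}()
	if rerr := r.Read(); rerr != nil {
		return fmt.Errorf("read %q: %w", url, rerr)
	}
	return nil
}

// check reports which sentinel errors are present in process's result.
func check(failRead, failClose bool) (hasRead, hasClose, isNil bool) {
	err := process(&resource{failRead: failRead, failClose: failClose}, "https://example.com")
	return errors.Is(err, errRead), errors.Is(err, errClose), err == nil
}

func main() {
	fmt.Println(process(&resource{failRead: true, failClose: true}, "https://example.com"))
}
```

The named return value is what lets the deferred function attach the close error after the body has already chosen its return value.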
53+
54+
Where to look next (high-value files)
55+
-------------------------------------
56+
- `fetch.go` — how pages are fetched, decoded and parsed.
57+
- `webinfo.go`: `Webinfo` type and `DownloadImage` implementation.
58+
- `fetch_test.go` — canonical tests and examples you should mirror for new behaviors.
59+
- `errs.go` and `go.mod` — error constants and dependency hints.
60+
- `Taskfile.yml` — canonical developer/test/lint workflow.
61+
62+
If anything above is unclear or you want more examples (small patches, test templates, or a CI-safe refactor suggestion), tell me which area to expand and I will iterate.
