FEAT: rebuild legacy rank and store #45

Merged
merged 2 commits into from
Jun 11, 2024
4 changes: 4 additions & 0 deletions checks/checks.go
@@ -3,10 +3,13 @@
import (
	"net/http"
	"time"
+
+	"github.com/xray-web/web-check-api/checks/store/legacyrank"
)

type Checks struct {
	Carbon *Carbon
+	LegacyRank *LegacyRank
	Rank *Rank
	SocialTags *SocialTags
	Tls *Tls
@@ -18,6 +21,7 @@
	}
	return &Checks{
		Carbon: NewCarbon(client),
+		LegacyRank: NewLegacyRank(legacyrank.NewInMemoryStore()),
[Codecov / codecov/patch] checks/checks.go#L24: added line #L24 was not covered by tests.
		Rank: NewRank(client),
		SocialTags: NewSocialTags(client),
		Tls: NewTls(client),
27 changes: 27 additions & 0 deletions checks/legacy_rank.go
@@ -0,0 +1,27 @@
package checks

import "github.com/xray-web/web-check-api/checks/store/legacyrank"

type DomainRank struct {
	Domain string `json:"domain"`
	Rank   int    `json:"rank"`
}

type LegacyRank struct {
	data legacyrank.Getter
}

func NewLegacyRank(lrg legacyrank.Getter) *LegacyRank {
	return &LegacyRank{data: lrg}
}

func (lr *LegacyRank) LegacyRank(domain string) (*DomainRank, error) {
	rank, err := lr.data.GetLegacyRank(domain)
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/legacy_rank.go#L21: added line #L21 was not covered by tests.
	}
	return &DomainRank{
		Domain: domain,
		Rank:   rank,
	}, nil
}
23 changes: 23 additions & 0 deletions checks/legacy_rank_test.go
@@ -0,0 +1,23 @@
package checks

import (
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/xray-web/web-check-api/checks/store/legacyrank"
)

func TestLegacyRank(t *testing.T) {
	t.Parallel()

	t.Run("get rank", func(t *testing.T) {
		t.Parallel()
		lr := NewLegacyRank(legacyrank.GetterFunc(func(domain string) (int, error) {
			return 1, nil
		}))
		dr, err := lr.LegacyRank("example.com")
		assert.NoError(t, err)
		assert.Equal(t, 1, dr.Rank)
		assert.Equal(t, "example.com", dr.Domain)
	})
}
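
Codecov flags the error branch of `LegacyRank` as uncovered. A second subtest could drive that branch by stubbing the getter with `GetterFunc`, as the existing test already does for the success path. The sketch below is not part of the PR: it copies the minimal types so it runs on its own, uses plain checks instead of testify, and `errStub` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal copies of the PR's types so the sketch runs on its own.
type Getter interface {
	GetLegacyRank(domain string) (int, error)
}

type GetterFunc func(domain string) (int, error)

func (f GetterFunc) GetLegacyRank(domain string) (int, error) { return f(domain) }

type DomainRank struct {
	Domain string
	Rank   int
}

type LegacyRank struct{ data Getter }

func NewLegacyRank(g Getter) *LegacyRank { return &LegacyRank{data: g} }

func (lr *LegacyRank) LegacyRank(domain string) (*DomainRank, error) {
	rank, err := lr.data.GetLegacyRank(domain)
	if err != nil {
		return nil, err
	}
	return &DomainRank{Domain: domain, Rank: rank}, nil
}

// errStub is a hypothetical error used only to drive the failure branch.
var errStub = errors.New("stub failure")

func main() {
	lr := NewLegacyRank(GetterFunc(func(string) (int, error) {
		return 0, errStub
	}))
	dr, err := lr.LegacyRank("example.com")
	// The error branch returns a nil result and passes the error through.
	fmt.Println(dr == nil, errors.Is(err, errStub))
}
```

In the real test file the same stub would presumably sit in another `t.Run` block with `assert.Error` and `assert.Nil`.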
99 changes: 99 additions & 0 deletions checks/store/legacyrank/legacy_rank.go
@@ -0,0 +1,99 @@
package legacyrank

import (
	"archive/zip"
	"bytes"
	"context"
	"encoding/csv"
	"errors"
	"io"
	"log"
	"net/http"
	"strconv"
	"sync"
	"time"
)

var ErrNotFound = errors.New("domain not found")

type Getter interface {
	GetLegacyRank(domain string) (int, error)
}

type GetterFunc func(domain string) (int, error)

func (f GetterFunc) GetLegacyRank(domain string) (int, error) {
	return f(domain)
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L25-L26: added lines #L25 - L26 were not covered by tests.
}

type InMemoryStore struct{}

var once sync.Once
var data map[string]int // map of domain to rank
Member:

Could this be encapsulated in the InMemoryStore?

Collaborator Author:

It could, but then the one-off download would happen once per InMemoryStore instance instead of once per process (on the first lookup). To avoid that, the data would need to live at the handler level, and as this is just a temporary solution, I didn't want to put it there.
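
One way to reconcile the two positions, sketched under assumptions: keep the `sync.Once` and the map inside the struct, and have the caller construct a single store at startup and share it. The `loader` field and `NewInMemoryStoreWith` are hypothetical names, not part of this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrNotFound = errors.New("domain not found")

// InMemoryStore owns its data and its one-off load instead of relying on
// package-level variables. Hypothetical variant, not the merged code.
type InMemoryStore struct {
	once   sync.Once
	data   map[string]int
	loader func() (map[string]int, error) // hypothetical injection point
}

func NewInMemoryStoreWith(loader func() (map[string]int, error)) *InMemoryStore {
	return &InMemoryStore{loader: loader}
}

func (s *InMemoryStore) GetLegacyRank(domain string) (int, error) {
	s.once.Do(func() {
		data, err := s.loader()
		if err != nil {
			return // data stays nil; lookups on a nil map report not found
		}
		s.data = data
	})
	rank, ok := s.data[domain]
	if !ok {
		return -1, ErrNotFound
	}
	return rank, nil
}

func main() {
	calls := 0
	store := NewInMemoryStoreWith(func() (map[string]int, error) {
		calls++
		return map[string]int{"example.com": 1}, nil
	})
	for i := 0; i < 3; i++ {
		rank, err := store.GetLegacyRank("example.com")
		fmt.Println(rank, err)
	}
	fmt.Println("loader calls:", calls)
}
```

With this shape the download still happens once, as long as one instance is shared. A struct that contains a `sync.Once` must not be copied after first use, so the constructor has to return a pointer.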


func NewInMemoryStore() *InMemoryStore {
	return &InMemoryStore{}
Member:

Why would we return the actual InMemoryStore, instead of a pointer to it? Or would that just be unnecessary here?

Collaborator Author:

It is a pointer because I originally intended to keep the data inside the struct, but I then moved it to a package-level global for the one-off load. Either would work here; the difference is negligible, although strictly speaking a non-pointer would be a tiny bit faster.

}

func (s *InMemoryStore) GetLegacyRank(url string) (int, error) {
	once.Do(func() {
		var err error
		data, err = load()
		if err != nil {
			log.Println(err)
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L43: added line #L43 was not covered by tests.
		}
	})

	rank, ok := data[url]
	if !ok {
		return -1, ErrNotFound
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L49: added line #L49 was not covered by tests.
	}
	return rank, nil
}

func load() (map[string]int, error) {
Member:

This could probably be split up into smaller functions.

Collaborator Author:

Yeah, I don't like the fact that it needs to be loaded into memory first. That comes down to the zip reader taking an io.ReaderAt rather than an io.Reader. It is possible to stream a zip file, but not with the native APIs.

Member:

And maybe around error handling too, as it's being logged but it looks like it's not returned to the caller.

Maybe error handling is something I can think about at a more global level. I'm not sure how that typically works with Go; I'd need to look into it. In the JS world, I'd have a reusable error class/function that is called whenever there's an error, so that error handling is consistent and observability/monitoring can be implemented in one place.

Collaborator Author:

Errors in Go are handled where they happen, not at a global level.
Errors are always bubbled up. Here, errors are returned to the caller, which in this case logs the error.
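
In the merged code, `load()` returns its error to the `once.Do` closure, which logs it; later lookups then see an empty map and return `ErrNotFound`. If the distinction matters to callers, one option is to remember the load error and return it, wrapped so that `errors.Is` still works. This is a sketch under assumptions: `ErrLoad`, `store`, `isLoadFailure` and `errTimeout` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrNotFound = errors.New("domain not found")
	ErrLoad     = errors.New("rank data unavailable") // hypothetical sentinel
	errTimeout  = errors.New("download timed out")    // stand-in for a real failure
)

// store remembers the outcome of its one-off load, including any error.
type store struct {
	once    sync.Once
	data    map[string]int
	loadErr error
	load    func() (map[string]int, error)
}

func (s *store) GetLegacyRank(domain string) (int, error) {
	s.once.Do(func() {
		s.data, s.loadErr = s.load()
	})
	if s.loadErr != nil {
		// Wrap the sentinel so callers can tell "no data" from "no such domain".
		return -1, fmt.Errorf("%w: %v", ErrLoad, s.loadErr)
	}
	rank, ok := s.data[domain]
	if !ok {
		return -1, ErrNotFound
	}
	return rank, nil
}

// isLoadFailure reports whether err came from a failed data load.
func isLoadFailure(err error) bool { return errors.Is(err, ErrLoad) }

func main() {
	s := &store{load: func() (map[string]int, error) { return nil, errTimeout }}
	_, err := s.GetLegacyRank("example.com")
	fmt.Println(err, isLoadFailure(err))
}
```

A handler could then map a load failure to a 5xx response and `ErrNotFound` to a 404, with each decision made where the error is received.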

	ctx, cancel := context.WithTimeout(context.Background(), time.Second*10)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip", nil)
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L59: added line #L59 was not covered by tests.
	}
	client := &http.Client{
		Timeout: time.Second * 10,
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L66: added line #L66 was not covered by tests.
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L71: added line #L71 was not covered by tests.
	}
	zf, err := zip.NewReader(bytes.NewReader(b), int64(len(b)))
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L75: added line #L75 was not covered by tests.
	}
	f, err := zf.Open("top-1m.csv")
	if err != nil {
		return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L79: added line #L79 was not covered by tests.
	}
	defer f.Close()
	r := csv.NewReader(f)
	data := make(map[string]int)
	for {
		record, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L90: added line #L90 was not covered by tests.
		}
		rank, err := strconv.Atoi(record[0])
		if err != nil {
			return nil, err
[Codecov / codecov/patch] checks/store/legacyrank/legacy_rank.go#L94: added line #L94 was not covered by tests.
		}
		data[record[1]] = rank
	}
	return data, nil
}
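
The review thread on `load()` suggests splitting it up. One possible seam, sketched here with hypothetical names (`parseRanks`, `ranksFromString`), is the CSV step: a function that takes an `io.Reader` can be tested without a download or a zip archive. The sketch also sets `FieldsPerRecord`, so a short row produces an error instead of an index-out-of-range panic on `record[1]`.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// parseRanks reads "rank,domain" CSV records and returns a domain-to-rank map.
// Hypothetical helper; the PR does this inline in load().
func parseRanks(r io.Reader) (map[string]int, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 2 // reject rows that do not have exactly two fields
	ranks := make(map[string]int)
	for {
		record, err := cr.Read()
		if errors.Is(err, io.EOF) {
			return ranks, nil
		}
		if err != nil {
			return nil, fmt.Errorf("read csv: %w", err)
		}
		rank, err := strconv.Atoi(record[0])
		if err != nil {
			return nil, fmt.Errorf("parse rank %q: %w", record[0], err)
		}
		ranks[record[1]] = rank
	}
}

// ranksFromString is a small convenience wrapper for examples and tests.
func ranksFromString(s string) (map[string]int, error) {
	return parseRanks(strings.NewReader(s))
}

func main() {
	ranks, err := ranksFromString("1,google.com\n2,microsoft.com\n")
	fmt.Println(ranks["microsoft.com"], err)
}
```

The download and unzip steps could be separated the same way. `zip.NewReader` needs an `io.ReaderAt` plus a size, which is why the archive is buffered in full before it is opened.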
26 changes: 26 additions & 0 deletions checks/store/legacyrank/legacy_rank_test.go
@@ -0,0 +1,26 @@
package legacyrank_test

import (
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/xray-web/web-check-api/checks/store/legacyrank"
)

func TestInMemoryStore(t *testing.T) {
	t.Parallel()

	t.Run("get google rank", func(t *testing.T) {
		t.Parallel()
		ims := legacyrank.NewInMemoryStore()
		dr, err := ims.GetLegacyRank("google.com")
		assert.NoError(t, err, dr)
	})

	t.Run("get microsoft rank", func(t *testing.T) {
		t.Parallel()
		ims := legacyrank.NewInMemoryStore()
		dr, err := ims.GetLegacyRank("microsoft.com")
		assert.NoError(t, err, dr)
	})
}
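
As written, these tests download the live list, so they depend on the network and on both domains staying in the top million. If `load()` accepted its client and URL as parameters, it could be exercised against a local `httptest` server instead. The sketch below assumes such a variant: `loadFrom`, `buildZip` and `ranksViaTestServer` are hypothetical names, and the loader body is a simplified copy of the logic in this PR.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// buildZip returns an in-memory zip archive holding a single file.
func buildZip(name, contents string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, contents); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// loadFrom is a hypothetical variant of load() that takes the client and URL
// as parameters instead of hard-coding them.
func loadFrom(client *http.Client, url string) (map[string]int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	zr, err := zip.NewReader(bytes.NewReader(b), int64(len(b)))
	if err != nil {
		return nil, err
	}
	f, err := zr.Open("top-1m.csv")
	if err != nil {
		return nil, err
	}
	defer f.Close()
	records, err := csv.NewReader(f).ReadAll()
	if err != nil {
		return nil, err
	}
	ranks := make(map[string]int, len(records))
	for _, rec := range records {
		if len(rec) < 2 {
			return nil, fmt.Errorf("short record: %v", rec)
		}
		rank, err := strconv.Atoi(rec[0])
		if err != nil {
			return nil, err
		}
		ranks[rec[1]] = rank
	}
	return ranks, nil
}

// ranksViaTestServer zips csvBody, serves it from a local test server, and
// loads it back through loadFrom without touching the real endpoint.
func ranksViaTestServer(csvBody string) (map[string]int, error) {
	archive, err := buildZip("top-1m.csv", csvBody)
	if err != nil {
		return nil, err
	}
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Write(archive)
	}))
	defer srv.Close()
	return loadFrom(srv.Client(), srv.URL)
}

func main() {
	ranks, err := ranksViaTestServer("1,google.com\n2,microsoft.com\n")
	fmt.Println(ranks["google.com"], ranks["microsoft.com"], err)
}
```

A test built this way could also assert the exact rank values, which the current tests cannot do because the live ranking changes.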
2 changes: 0 additions & 2 deletions go.mod
@@ -14,7 +14,6 @@ require (
	github.com/gobwas/httphead v0.1.0 // indirect
	github.com/gobwas/pool v0.2.1 // indirect
	github.com/gobwas/ws v1.4.0 // indirect
-	github.com/h2non/parth v0.0.0-20190131123155-b4df798d6542 // indirect
	github.com/josharian/intern v1.0.0 // indirect
	github.com/kr/pretty v0.3.1 // indirect
	github.com/mailru/easyjson v0.7.7 // indirect
@@ -28,6 +27,5 @@ require (
	github.com/stretchr/testify v1.9.0
	golang.org/x/net v0.25.0
	golang.org/x/sys v0.20.0 // indirect
-	gopkg.in/h2non/gock.v1 v1.1.2
	gopkg.in/yaml.v3 v3.0.1 // indirect
)
6 changes: 0 additions & 6 deletions go.sum
@@ -21,8 +21,6 @@ github.com/gobwas/pool v0.2.1/go.mod h1:q8bcK0KcYlCgd9e7WYLm9LpyS+YeLd8JVDW6Wezm
github.com/gobwas/ws v1.3.2/go.mod h1:hRKAFb8wOxFROYNsT1bqfWnhX+b5MFeJM9r2ZSwg/KY=
github.com/gobwas/ws v1.4.0 h1:CTaoG1tojrh4ucGPcoJFiAQUAsEWekEWvLy7GsVNqGs=
github.com/gobwas/ws v1.4.0/go.mod h1:G3gNqMNtPppf5XUz7O4shetPpcZ1VJ7zt18dlUeakrc=
-github.com/h2non/parth v0.0.0-20190131123155-b4df798d6542 h1:2VTzZjLZBgl62/EtslCrtky5vbi9dd7HrQPQIx6wqiw=
-github.com/h2non/parth v0.0.0-20190131123155-b4df798d6542/go.mod h1:Ow0tF8D4Kplbc8s8sSb3V2oUCygFHVp8gC3Dn6U4MNI=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
@@ -33,8 +31,6 @@ github.com/ledongthuc/pdf v0.0.0-20220302134840-0c2507a12d80 h1:6Yzfa6GP0rIo/kUL
github.com/ledongthuc/pdf v0.0.0-20220302134840-0c2507a12d80/go.mod h1:imJHygn/1yfhB7XSJJKlFZKl/J+dCPAknuiaGOshXAs=
github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
-github.com/nbio/st v0.0.0-20140626010706-e9e8d9816f32 h1:W6apQkHrMkS0Muv8G/TipAy/FJl/rCYT0+EuS8+Z0z4=
-github.com/nbio/st v0.0.0-20140626010706-e9e8d9816f32/go.mod h1:9wM+0iRr9ahx58uYLpLIr5fm8diHn0JbqRycJi6w0Ms=
github.com/orisano/pixelmatch v0.0.0-20220722002657-fb0b55479cde h1:x0TT0RDC7UhAVbbWWBzr41ElhJx5tXPWkIHA2HWPRuw=
github.com/orisano/pixelmatch v0.0.0-20220722002657-fb0b55479cde/go.mod h1:nZgzbfBr3hhjoZnS66nKrHmduYNpc34ny7RK4z5/HM0=
github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA=
@@ -87,7 +83,5 @@ golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8T
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
-gopkg.in/h2non/gock.v1 v1.1.2 h1:jBbHXgGBK/AoPVfJh5x4r/WxIrElvbLel8TCZkkZJoY=
-gopkg.in/h2non/gock.v1 v1.1.2/go.mod h1:n7UGz/ckNChHiK05rDoiC4MYSunEC/lyaUm2WWaDva0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=