alexandear / article-similarity

DevChallenge XVII. Backend Online Round

License: The Unlicense
Latest commit: Oleksandr Redko, "Change license to The Unlicense" (4ff1f43), 5 years ago
๐Ÿ“ .github
๐Ÿ“ api
๐Ÿ“ assets
๐Ÿ“ cmd
๐Ÿ“ docs
๐Ÿ“ internal
๐Ÿ“ test
๐Ÿ“ vendor
๐Ÿ“„ .dockerignore
๐Ÿ“„ .editorconfig
๐Ÿ“„ .gitattributes
๐Ÿ“„ .gitignore
๐Ÿ“„ .golangci.yml
๐Ÿ“„ docker-compose.yml
๐Ÿ“„ Dockerfile
๐Ÿ“„ Dockerfile.build
๐Ÿ“„ go.mod
๐Ÿ“„ go.sum
๐Ÿ“„ LICENSE
๐Ÿ“„ main.go
๐Ÿ“„ Makefile
๐Ÿ“„ README.md
๐Ÿ“„ SCALEME.md
๐Ÿ“„ tools.go
๐Ÿ“„ README.md

Article Similarity

HTTP server to store and search similar articles.

Getting started

Run the server and storage containers with Compose:

docker-compose up

The API is accessible at http://localhost:80/.

API docs

The API is described in the docs/API file.

The server also serves HTML documentation. Run docker-compose up and visit http://localhost:80/docs.

Similarity algorithm

Similarity between the content of articles is computed with the Levenshtein algorithm applied to words. Before the algorithm is applied, the content is preprocessed:

  • the articles a, an, the and the punctuation characters .,!?-; are removed;
  • the content is split into words on the whitespace characters \t\n\r;
  • all irregular verbs are replaced with their infinitive; the irregular verbs are listed in the file assets/irregularverbs.csv;
  • the text is lower-cased.

The algorithm works for English content only.
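
The preprocessing steps and the word-level Levenshtein distance described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this repository. The names normalize, levenshtein, similarity and min3 are hypothetical, and the small irregularVerbs map stands in for the contents of assets/irregularverbs.csv. The conversion of the distance into a ratio between 0 and 1 is also an assumption, since the README does not state how the final score is derived. The sketch deletes punctuation rather than replacing it with a space, and it splits on all whitespace, including plain spaces.

```go
package main

import "strings"

// irregularVerbs is a tiny hypothetical stand-in for assets/irregularverbs.csv:
// it maps an irregular verb form to its infinitive.
var irregularVerbs = map[string]string{
	"went": "go", "gone": "go", "was": "be", "were": "be", "been": "be",
}

// punctuation deletes the characters .,!?-; (assumption: they are dropped,
// not replaced with a space).
var punctuation = strings.NewReplacer(".", "", ",", "", "!", "", "?", "", "-", "", ";", "")

// normalize applies the preprocessing steps and returns the resulting words.
func normalize(content string) []string {
	cleaned := strings.ToLower(punctuation.Replace(content))
	words := make([]string, 0)
	// strings.Fields splits on any whitespace, including plain spaces.
	for _, w := range strings.Fields(cleaned) {
		if w == "a" || w == "an" || w == "the" {
			continue // drop English articles
		}
		if infinitive, ok := irregularVerbs[w]; ok {
			w = infinitive
		}
		words = append(words, w)
	}
	return words
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between two word sequences, where
// one edit is the insertion, deletion or substitution of a whole word.
func levenshtein(a, b []string) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(b)]
}

// similarity turns the distance into a ratio in [0, 1]. The formula
// 1 - distance/max(len) is an assumption made for this sketch.
func similarity(first, second string) float64 {
	a, b := normalize(first), normalize(second)
	longest := len(a)
	if len(b) > longest {
		longest = len(b)
	}
	if longest == 0 {
		return 1 // two empty texts are treated as identical
	}
	return 1 - float64(levenshtein(a, b))/float64(longest)
}
```

With these definitions, similarity("The cat went to a store!", "the cat went to store") yields 1, because both texts normalize to the same four words. Comparing words instead of characters keeps the distance insensitive to word length, so replacing one long word costs the same as replacing a short one.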

Scalability

See the SCALEME.md file.

Technologies

The project consists of an HTTP server written in Go and MongoDB storage.

Development

Prerequisites: docker, docker-compose, go@1.15 and make must be installed.

Code style

A consistent code style is enforced by gofmt, EditorConfig tools and the golangci-lint linter.

Format code:

make format

Run linter:

make lint

Tests

There are unit and integration (end-to-end) tests. Unit tests are placed in _test.go files, end-to-end tests in the test directory.

Run unit tests:

make test

The end-to-end test suite builds the server from sources, runs docker-compose up and performs requests against the server container. Run it with:

make test-it

Docker

Build the Docker image article-similarity:latest:

make docker

Build the project and run the linter and tests in the dev Docker image article-similarity-dev:latest:

make docker-dev

CI

GitHub Actions workflows are configured to build, lint and run the unit and integration tests. See the .github/workflows directory.