Semantic Similarity Inference API
Building an inference API for semantic similarity becomes worthwhile as soon as you want to use semantic similarity in production. Keep in mind, though, that building such an API is not necessarily easy: you need to code the API itself (the easy part), but you also need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.
Such an API is attractive because it is completely decoupled from the rest of your stack (microservice architecture), so you can scale it independently and ensure high availability of your models through redundancy. An API is also the way to go for language interoperability. Most machine learning frameworks are developed in Python, but you will likely want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.
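To make the "easy part" concrete, here is a minimal sketch of such an API using only the Python standard library. Everything in it is an assumption made for illustration: the /similarity path, the {"sentences": [...]} request shape, the port, and above all the embed() function, which is a toy word-count stand-in for a real sentence-embedding model. It shows the decoupling idea only, not the hard part (availability, latency, scaling).

```python
import json
import math
import re
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer


def embed(text):
    """Toy stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def similarity(first, second):
    """Score two texts; a real service would call its model here."""
    return cosine_similarity(embed(first), embed(second))


class SimilarityHandler(BaseHTTPRequestHandler):
    """Handles POST /similarity (hypothetical route) with a JSON body."""

    def do_POST(self):
        if self.path != "/similarity":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            first, second = payload["sentences"][0], payload["sentences"][1]
            score = similarity(first, second)
        except (ValueError, KeyError, IndexError, TypeError, AttributeError):
            self.send_error(400, "Bad request body")
            return
        body = json.dumps({"score": score}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Run the sketch server until interrupted (blocking call)."""
    HTTPServer((host, port), SimilarityHandler).serve_forever()
```

Once serve() is running, a client written in JavaScript, Go, Ruby, or any other language only needs to send an HTTP POST with a JSON body and read the JSON response, which is the interoperability benefit described above. Because the service sits behind plain HTTP, several copies of it could run behind a load balancer for redundancy without the callers changing.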