Site search algorithm

Hi guys,

Not sure if this is the right category so feel free to move this thread if needed :slight_smile:.

I was wondering if you could explain a bit about how the site search works, in particular the algorithm. How does it sort its results? For example, is it by the number of occurrences of the search query on a page? Or by date?

Would love to know a bit more so I can share this with my team and our clients.

Cheers,

Floris

Hey @floris.tenhove, sorry, but we need to dig into this a bit. We’ll be in touch when we know more!

Hey @mat_jack1, any updates on this? I’m reluctant to come across too pushy here but as I’ve had a couple of clients asking about this it would be really nice to get some more information from you guys.

Cheers

hi @floris.tenhove! sorry for the delay. I’ll take a look tomorrow and I hope to get through to an answer :slight_smile:

No worries, thank you @faber!

Hi @floris.tenhove :slight_smile:

Sorry for the delay. The search term is matched against the titles and bodies of indexed pages.

If the term is found in the body, the match gets a score of 1x; if it is found in the title, it gets a score of 1.5x. So a term found in a page title scores 50% higher than the same term found in the page body.

If the term is found in both the page title and the page body, the two scores are added together and divided by 2.

Results are then sorted by score.
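
To make the rules above concrete, here is a minimal Python sketch of them. It is only an illustration, not the actual implementation: the `Page` class, the `score` and `search` functions, and the plain case-insensitive substring match are all assumptions, and the real engine presumably applies the 1x / 1.5x multipliers to its own relevance score rather than to a simple yes/no match.

```python
from dataclasses import dataclass

# Multipliers taken from the explanation above.
BODY_WEIGHT = 1.0
TITLE_WEIGHT = 1.5


@dataclass
class Page:
    """Hypothetical stand-in for an indexed page."""
    title: str
    body: str


def score(page: Page, term: str) -> float:
    """Score one page for a term using the rules described above.

    Assumption: a match is a case-insensitive substring hit.
    """
    needle = term.lower()
    in_title = needle in page.title.lower()
    in_body = needle in page.body.lower()

    if in_title and in_body:
        # Both scores are added together and divided by 2.
        return (TITLE_WEIGHT + BODY_WEIGHT) / 2
    if in_title:
        return TITLE_WEIGHT
    if in_body:
        return BODY_WEIGHT
    return 0.0


def search(pages: list[Page], term: str) -> list[tuple[float, Page]]:
    """Return (score, page) pairs for matching pages, highest score first."""
    hits = [(score(p, term), p) for p in pages]
    hits = [(s, p) for s, p in hits if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


pages = [
    Page(title="About us", body="See our pricing details."),
    Page(title="Pricing", body="Plans for every team."),
    Page(title="Pricing FAQ", body="Common pricing questions."),
]
results = search(pages, "pricing")
for s, p in results:
    print(f"{s:.2f}  {p.title}")
```

Read literally, these rules mean a title-only match (1.5) ranks above a match in both title and body ((1.5 + 1) / 2 = 1.25), which in turn ranks above a body-only match (1.0).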

You can inspect the scores by querying our CMA (Content Management API): https://www.datocms.com/docs/content-management-api/resources/search-result/instances

By default, the endpoint returns 20 search results, but you can ask for more and paginate too.

Hope this helps!

Thank you for this @faber! This is really helpful.