Guide: Hide your Lemmy instance from search engines
With Lemmy's default robots.txt and meta tags, search engines will index even non-local communities. As a result, unrelated or unwanted content from other instances can show up in search results under your instance's domain.
As of today, lemmy-ui does not allow hiding non-local (or any) communities from Google and other search engines. If you, like me, do not want your instance to be associated with other content, you can add a custom robots.txt and response headers to avoid indexing.
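As a rough illustration of that approach, here is a minimal Python sketch of a blanket robots.txt plus a noindex response header, with a quick check using the standard library's robots.txt parser. The exact rules, the header value, the helper name is_crawlable, and the instance.example hostname are my assumptions for illustration, not something lemmy-ui ships. How you actually serve the file and header depends on your reverse proxy.

```python
from urllib.robotparser import RobotFileParser

# Assumed blanket policy: ask every crawler to stay away from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Assumed response header to send with every page.
# X-Robots-Tag is a real header that major search engines honour.
NOINDEX_HEADERS = {"X-Robots-Tag": "noindex, nofollow"}


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # instance.example is a placeholder hostname.
    for url in (
        "https://instance.example/",
        "https://instance.example/c/news@other.example",
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Keep in mind that robots.txt only asks crawlers not to fetch pages, while the header asks them not to index pages they do fetch, so which of the two matters more depends on the crawler.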
Would it be a better idea to exclude only URLs that match /c/*@*.*? I think that would block external communities while keeping local ones indexable at their native locations.
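To sanity-check which paths that pattern would catch, here is a small Python sketch that treats * the way major crawlers do in robots.txt (any run of characters, matched from the start of the path). The helper names and sample paths are hypothetical, and the matcher only handles *, not the $ end anchor. Python's own urllib.robotparser does not expand wildcards, which is why the sketch uses a regex instead.

```python
import re

# The rule as it might appear in robots.txt (assumed form of the suggestion).
RULE = "Disallow: /c/*@*.*"
PATTERN = RULE.split(":", 1)[1].strip()  # "/c/*@*.*"


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern with * wildcards into a regex."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def is_blocked(pattern: str, path: str) -> bool:
    """Return True if `path` matches `pattern` from its first character."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths for illustration.
    for path in (
        "/c/news@other.example",   # remote community
        "/c/announcements",        # local community
        "/post/123",               # post page
    ):
        print(path, "blocked:", is_blocked(PATTERN, path))
```

Note that the pattern only covers community pages under /c/. Posts, comments, and user profiles that originate from other instances would still be crawlable unless they get rules of their own.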