Lemmy server admins, please enable pg_stat_statements on your PostgreSQL server so the Lemmy community can better identify the performance problems on the more active sites
I have been working with pg_stat_statements extension to PG and it give us a way to see the actual SQL statements being executed by lemmy_server and the number of times they are being called.
This has less overhead than cranking up logging and several cloud computing services enable it by default (example) - so I don't believe it will have a significant slow down of the server.
A DATABASE RESTART WILL BE REQUIRED
It does require that PostgreSQL be restarted. Which can take 10 or 15 seconds, typically.
Interesting, over here at programmer.dev this shows "10 comments" when I don't think there were ever that many, are edits being counted as new comments? I did edit postings several times.
Some major stored procedure SQL problems were overlooked until 2 days ago. I submitted revised statements to fix a massive write operation on every single post and comment creation on a local site. The site_aggregates table... every row modified instead of a single row. https://github.com/LemmyNet/lemmy/pull/3704
Once the extension is installed and enabled, you can reset the stats whenever you want and it will start to build a new list of queries and I assume reset the average execution time.
I found a curious query that seems to take 10 or 11 seconds on my server:
SELECT "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "person_aggregates"."id", "person_aggregates"."person_id", "person_aggregates"."post_count", "person_aggregates"."post_score", "person_aggregates"."comment_count", "person_aggregates"."comment_score" FROM ("person" INNER JOIN "person_aggregates" ON ("person_aggregates"."person_id" = "person"."id"))
WHERE (("person"."admin" = $1) AND ("person"."deleted" = $2)) ORDER BY "person"."published"
This query is being called over 1000 times an hour with my instance only really doing incoming Federation content. Should this be optimized by doing a subSELECT on the WHERE clause before doing the INNER JOIN?
IMPORTANT NOTE: I am rusty on my PostgreSQL, it's been over 10 years since I designed and ran a major time-critical website with it. That is not "SECONDS", it is "MILLISECONDS", but it is still a slow query being called frequently.
Ok, re-reading the documentation again, I made a major error interpreting these results.
mean_exec_time double precision: Mean time spent executing the statement, in milliseconds
All my statements about INSERT on Upvotes taking 1/3 of a second are wrong, it's less than 1 millisecond. Although it sure doesn't feel that fast when you are interactively using Lemmy 0.18 and pressing the Vote buttons, it seems rather sluggish. I've almost never seen fractions of a milliseconds, but here it is.
This query is asking the database to slice into a string in ORDER BY and in WHERE criteria:
SELECT "comment"."id", "comment"."creator_id", "comment"."post_id", "comment"."content", "comment"."removed", "comment"."published", "comment"."updated", "comment"."deleted", "comment"."ap_id", "comment"."local", "comment"."path", "comment"."distinguished", "comment"."language_id", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."embed_video_url", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."language_id", "post"."featured_community", "post"."featured_local", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."icon", "community"."banner", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "comment_aggregates"."id", "comment_aggregates"."comment_id", "comment_aggregates"."score", "comment_aggregates"."upvotes", "comment_aggregates"."downvotes", "comment_aggregates"."published", "comment_aggregates"."child_count", "comment_aggregates"."hot_rank", "community_person_ban"."id", "community_person_ban"."community_id", "community_person_ban"."person_id", "community_person_ban"."published", "community_person_ban"."expires", "community_follower"."id", "community_follower"."community_id", "community_follower"."person_id", "community_follower"."published", "community_follower"."pending", "comment_saved"."id", "comment_saved"."comment_id", "comment_saved"."person_id", "comment_saved"."published", "person_block"."id", "person_block"."person_id", "person_block"."target_id", "person_block"."published", "comment_like"."score"
FROM ((((((((((("comment" INNER JOIN "person" ON ("comment"."creator_id" = "person"."id")) INNER JOIN "post" ON ("comment"."post_id" = "post"."id")) INNER JOIN "community" ON ("post"."community_id" = "community"."id")) INNER JOIN "comment_aggregates" ON ("comment_aggregates"."comment_id" = "comment"."id")) LEFT OUTER JOIN "community_person_ban" ON ((("community"."id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "comment"."creator_id")) AND (("community_person_ban"."expires" IS NULL) OR ("community_person_ban"."expires" > CURRENT_TIMESTAMP)))) LEFT OUTER JOIN "community_follower" ON (("post"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = $1))) LEFT OUTER JOIN "comment_saved" ON (("comment"."id" = "comment_saved"."comment_id") AND ("comment_saved"."person_id" = $2))) LEFT OUTER JOIN "person_block" ON (("comment"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = $3))) LEFT OUTER JOIN "community_block" ON (("community"."id" = "community_block"."community_id") AND ("community_block"."person_id" = $4))) LEFT OUTER JOIN "comment_like" ON (("comment"."id" = "comment_like"."comment_id") AND ("comment_like"."person_id" = $5))) LEFT OUTER JOIN "local_user_language" ON (("comment"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = $6)))
WHERE ((("community"."hidden" = $7) OR ("community_follower"."person_id" = $8)) AND (nlevel("comment"."path") <= $9)) ORDER BY subpath("comment"."path", $10, $11), "comment_aggregates"."hot_rank" DESC LIMIT $12 OFFSET $13
the nlevel on comment.path field <= and the ORDER BY subpath on the same field are really dragging down this query.
I am still trying to figure out the circumstances on the lemmy-ui webapp where this particular 13 parameter query is triggered. A GitHub grep of the source code shows 'subpath' is only used in a single place.