I'm sketching out an idea for a readability assessment program. It will report the education level required to comfortably read a body of text, using formulas (Dale-Chall being the most significant) that look at things like sentence length and the vocabulary level of each word. I was inspired by the word counter website I always paste my essays into. When it's done, I would like to hook it up to APIs so it can be used on Lemmy, Mastodon, and Discord.
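For anyone curious what the Dale-Chall part boils down to, here is a minimal sketch in C. It assumes the published New Dale-Chall formula and that the caller has already counted words, sentences, and "difficult" words (words not on the easy list). The function name `dale_chall_score` and the -1.0 error value are made up for illustration.

```c
#include <stddef.h>

/* New Dale-Chall raw score (sketch).
 * words:           total word count of the text
 * sentences:       total sentence count
 * difficult_words: words NOT found on the easy-word list
 * Returns -1.0 if the text has no words or no sentences. */
double dale_chall_score(size_t words, size_t sentences, size_t difficult_words)
{
    if (words == 0 || sentences == 0)
        return -1.0;

    double pct_difficult = 100.0 * (double)difficult_words / (double)words;
    double avg_sentence_len = (double)words / (double)sentences;

    double score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len;

    /* The formula adds a fixed adjustment once more than 5% of the
     * words are difficult. */
    if (pct_difficult > 5.0)
        score += 3.6365;

    return score;
}
```

The hard part of the program is the counting (splitting sentences, deciding what is a word, handling proper nouns), not the arithmetic; the score then maps to a grade band through the published lookup table.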
In the United States, for over a hundred years, the ruling interests tirelessly propagated anticommunism among the populace, until it became more like a religious orthodoxy than a political analysis. During the Cold War, the anticommunist ideological framework could transform any data about existing communist societies into hostile evidence. If the Soviets refused to negotiate a point, they were intransigent and belligerent; if they appeared willing to make concessions, this was but a skillful ploy to put us off our guard. By opposing arms limitations, they would have demonstrated their aggressive intent; but when in fact they supported most armament treaties, it was because they were mendacious and manipulative. If the churches in the USSR were empty, this demonstrated that religion was suppressed; but if the churches were full, this meant the people were rejecting the regime’s atheistic ideology. If the workers went on strike (as happened on infrequent occasions), this was evidence of their alienation from the collectivist system; if they didn’t go on strike, this was because they were intimidated and lacked freedom. A scarcity of consumer goods demonstrated the failure of the economic system; an improvement in consumer supplies meant only that the leaders were attempting to placate a restive population and so maintain a firmer hold over them. If communists in the United States played an important role struggling for the rights of workers, the poor, African-Americans, women, and others, this was only their guileful way of gathering support among disfranchised groups and gaining power for themselves. How one gained power by fighting for the rights of powerless groups was never explained. What we are dealing with is a nonfalsifiable orthodoxy, so assiduously marketed by the ruling interests that it affected people across the entire political spectrum.
-- Michael Parenti, Blackshirts And Reds
I am a bot, and this action was performed automatically. Please contact the admins of this instance if you have any questions or concerns.
BTW, do you guys think I should use a database for this? The one formula uses a list of 4,000 easy words, and storing lists of common proper nouns will help with flagging them. Also, I could probably get vocab level data for tens of thousands of words... would that be better in a DB than in a ginormous hash table or trie?
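For comparison with the DB route, a third in-memory option besides a hash table or trie is a sorted array searched with the standard library's `bsearch`. This is only a sketch: the six-word array is a stand-in for the real list, and the names `easy_words`, `cmp_word`, and `is_easy_word` are made up here.

```c
#include <stdlib.h>
#include <string.h>

/* Tiny stand-in for the real easy-word list.
 * bsearch requires it to be sorted in strcmp order. */
static const char *const easy_words[] = {
    "about", "cat", "dog", "house", "run", "water"
};
static const size_t easy_count = sizeof easy_words / sizeof easy_words[0];

/* bsearch passes the key first and a pointer to an array element second. */
static int cmp_word(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if word is on the list, 0 otherwise. */
int is_easy_word(const char *word)
{
    return bsearch(word, easy_words, easy_count,
                   sizeof easy_words[0], cmp_word) != NULL;
}
```

With a few thousand entries, each lookup is around a dozen string comparisons, and the list can be loaded from a plain text file and sorted once with `qsort` at startup.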
With a dataset that small, either option is fine imo. If it were me, I would start with an ORM + SQLite, in case I ever needed to migrate to a "real" database.
I'm writing the CLI in C (the bots will just call it) and have never used a database. Do you think using the SQLite interface directly from C, with some cursory reading of the docs, would be too much? Of course, I could switch it all to C++, where there appears to be at least one nice ORM.