Maybe it’s correct to treat UTF-8, UTF-16 and UTF-32 as variants of one main “text” type, but labelling them “plain” seems ambiguous or superfluous: what even is “plain text”? For any “text”, you always need to know its character encoding to read it, right? I guess the name has to stay for historical/compatibility reasons, though.
But as long as we're building on the heritage of “MIME” types, should we extend the file signature syntax to allow an optional extra field? Then the character encoding could go in there as a separate component for “text” types.
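
To make the idea concrete, here's a rough sketch of what parsing such an optional field could look like. It assumes the extra field would follow the MIME convention of `; key=value` parameters (as in `text/plain; charset=utf-8`); the `Signature` type and `parse_signature` function are hypothetical names for illustration, not anything that exists in the current syntax.

```python
from typing import NamedTuple


class Signature(NamedTuple):
    # Hypothetical container for a parsed signature string.
    main_type: str
    subtype: str
    params: dict


def parse_signature(text: str) -> Signature:
    """Split 'type/subtype; key=value' into its parts (assumed MIME-like syntax)."""
    head, *rest = text.split(";")
    main_type, _, subtype = head.strip().partition("/")
    params = {}
    for field in rest:
        key, sep, value = field.partition("=")
        if sep:  # ignore empty or malformed fields that have no '='
            params[key.strip().lower()] = value.strip().strip('"')
    return Signature(main_type.lower(), subtype.lower(), params)


sig = parse_signature("text/plain; charset=UTF-16")
print(sig.main_type, sig.subtype, sig.params.get("charset"))
```

A signature without the extra field still parses and just yields an empty `params`, so existing “text/plain” entries wouldn't need to change.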