Character

Character

Characters can be classified in a variety of ways. One could partition the ascii characters into digits, symbols, upper case letters, lower case letters, and non-printing characters. Unicode is an incredibly robust standard that classifies all characters as well. We’ll simplify our analysis for the purpose of identifiers in programming languages, which are a bit more limited than the extent of all of unicode. Something to loop back to.

Moving forward, we will define a character as one of the printable ascii characters. We can partition these characters in the following classes:

  • The space character (0x20)
  • Digits 0-9 (0x30-0x39)
  • Upper case letters (0x41-0x5A)
  • Lower case letters (0x61-0x7A)
  • Symbols, everything else

Letter(?) Case

Should this be letter case or character case? A “character” is important in terms of programming, but a “letter” is important in terms of linguistic usage. What’s more important? Letter sounds great until I need to start defining edge cases for other no-case characters… Character may not even be accurate either, since in unicode we would call it a codepoint.

The upper and lower case letters both represent the same arabic letter. We say these characters have a letter case of either upper or lower. The remaining characters of space, symbols, and digits do not have a letter case.

name examples
lower r, j, d
upper R, J, D
no case -, 8, ~, space