Alternate view:
It's not what it is, it's what you do with it.
Text is expected to be formatted in device-specific ways. Excess whitespace may be omitted entirely (e.g., HTML, C, etc.), expanded (e.g., typesetting?), or translated (e.g., utilizing Unicode spaces and joiners, or stripping them as the case may be). Control characters have specified purposes, and also may be translated or removed as needed (e.g., *nix family LF vs. DOS family CR-LF line endings). Everything else is given specific textual representation -- i.e., the graphic letters I am presently typing.
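To make the "translated or removed as needed" bit concrete, here's a minimal Python sketch of two such transformations. The function names are my own, made up for illustration, and the whitespace rule is only a rough approximation of what an HTML renderer does, not the real algorithm.

```python
import re


def normalize_newlines(text: str) -> str:
    """Translate DOS (CR-LF) and bare CR line endings to a single LF."""
    # Replace CR-LF first so the pair does not turn into two LFs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def collapse_whitespace(text: str) -> str:
    """Collapse every run of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()
```

Both are harmless on text, because the reader of the text doesn't care which line ending or how many spaces were in the file. That's the whole point of calling it text.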
Most of these features are implemented by operating systems, sometimes transparently (e.g., opening the CON device as a text file). If you're writing your own, say, serial terminal / emulator, you must implement these as well (the whole point of a terminal is at least basic text formatting and control, if not full ANSI or VT100 or whatever operating modes).
Whereas, "binary" must be inscrutable and untouchable. Any accidental change of bits or bytes in the file will likely corrupt it for its intended purpose, and it must always be transferred wholly intact.
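A quick sketch of what goes wrong when a text-style translation touches binary data. The 8-byte PNG signature is a handy victim, since it deliberately contains both a CR-LF pair and a lone LF so that this kind of damage is detectable. The helper `dos_to_unix` is a hypothetical name of mine, standing in for whatever transfer tool or "text mode" does the translating.

```python
# The standard 8-byte PNG file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def dos_to_unix(data: bytes) -> bytes:
    """Blindly apply a text-mode newline translation to raw bytes."""
    return data.replace(b"\r\n", b"\n")


damaged = dos_to_unix(PNG_SIGNATURE)

# One byte has silently gone missing, so a decoder would reject the file.
signature_intact = damaged == PNG_SIGNATURE
bytes_lost = len(PNG_SIGNATURE) - len(damaged)
```

Here `signature_intact` comes out False and `bytes_lost` is 1. The same translation applied to the middle of compressed image data would be just as destructive, only harder to spot.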
Now, it might well be that there are a great many ways a particular binary file could be changed, while remaining equivalent in some useful way to the original. Examples: EXIF data in JPEG and various other formats; the number and size of chunks in a PNG file; the headers and memory mapping in an EXE file; etc. But there are so many formats out there that assuming any one of them is a bad idea. So we just call it "binary" and keep our hands off it.
You wouldn't want your OS going in and recompressing your images willy-nilly, would you? (Mind, a lot of hosting servers do this for you, and much more -- beware!)
So -- text can always be treated as binary, but the converse is not true. Text implies a format so standard (i.e., ASCII) that everyone can make the same assumptions about it (printable characters, variable space, meaningful control characters, etc.).
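The "same assumptions" part can be turned into a crude test. This is only a heuristic sketch under my own assumptions (7-bit ASCII, with tab, LF and CR as the only acceptable control characters), and the function name is made up. Real tools use fancier guesses, and none of them can be fully right.

```python
def looks_like_ascii_text(data: bytes) -> bool:
    """Heuristic: True if every byte is printable ASCII or a common control."""
    allowed_controls = {0x09, 0x0A, 0x0D}  # tab, LF, CR
    return all(0x20 <= b <= 0x7E or b in allowed_controls for b in data)
```

Note the asymmetry: the function takes bytes either way, so text is being handled as binary here without any trouble. Going the other direction is where the guessing starts. An empty input passes trivially, which is a reminder that this is a plausibility check and not a proof.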
As noted above, these definitions don't need to be exclusive. A hex file isn't hex as such (i.e., digits restricted to 0-15, in packed or unpacked bytes say), but it's ASCII coded. Guess I'd say hex is a subset of text, and text is a subset of general binary files. Don't forget there are always polyglot formats -- someone's devised a plain text version of x86 machine code, which might be considered simultaneously both binary and ASCII. (Emphasis on binary though, as almost any change in the file will likely corrupt the executable part.)
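The hex-inside-text-inside-binary nesting is easy to show with a few bytes. This sketch uses bare hex digits only, not a structured format like Intel HEX with its record types and checksums, and the payload is arbitrary.

```python
# Arbitrary binary payload, including bytes that are not printable ASCII.
payload = bytes([0x00, 0xFF, 0x10, 0x80])

# Encode as hex digits: the result is itself a sequence of ASCII bytes.
as_hex_text = payload.hex().encode("ascii")

# Every byte of the hex form is an ordinary character from 0-9 or a-f.
is_plain_ascii = all(chr(b) in "0123456789abcdef" for b in as_hex_text)

# Decoding the digits recovers the original binary exactly.
round_trip = bytes.fromhex(as_hex_text.decode("ascii"))
```

Here `as_hex_text` is `b"00ff1080"`, twice the size of the payload, and `round_trip` equals `payload`. The hex form survives newline translation and the like, which is more or less why such formats exist.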
Tim