Technically, the ISO (or ASA) rating of a photographic film refers to the "toe" of its characteristic curve, where the emulsion just begins to respond to exposure. Many transparency films, and careful users of the Zone System, instead use "EI" (exposure index), which is expressed in the same units but measures the mid-point of the S-curve from blank to fully exposed; unlike ISO, it depends on development. ISO is supposed to be a parameter of the film alone. Note that the photographic literature rarely discusses this distinction.
Now, for a digital sensor, there is an inherent maximum level, where the charge on each pixel saturates as the diode forward-biases. Typically, a camera sensor saturates at an exposure level roughly equivalent to that of ISO 100 to 200 film. The very first line-sensor device from Fairchild that I encountered in the early 1970s was specified as equivalent (in that sense) to ISO 100.
The ISO setting in a digital camera is, in fact, the gain from the sensor to the ADC input, as you stated. The quantization noise of the ADC remains constant relative to the digital output word, but as you crank up the gain, the noise in the dark current of the sensor becomes larger relative to the full scale of the ADC, and the image appears noisier.
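A minimal numpy sketch of that gain-before-the-ADC model (all numbers are illustrative, not taken from any real sensor): the quantization step stays fixed, so amplifying the sensor signal also amplifies the sensor noise relative to full scale.

```python
import numpy as np

rng = np.random.default_rng(0)

FULL_SCALE = 1.0          # ADC full-scale input, arbitrary units
ADC_BITS = 12             # hypothetical 12-bit converter
LSB = FULL_SCALE / 2**ADC_BITS

def digitize(signal, gain):
    """Amplify the sensor signal ("ISO"), clip at full scale, quantize."""
    amplified = np.clip(signal * gain, 0.0, FULL_SCALE)
    return np.round(amplified / LSB) * LSB

# A dim, constant scene plus sensor (dark-current) noise.
scene = 0.01              # fraction of sensor saturation
sensor_noise = 0.001      # rms noise at the sensor, before gain

for gain in (1, 4, 16):   # "ISO" settings: analog gain before the ADC
    samples = scene + rng.normal(0.0, sensor_noise, 100_000)
    out = digitize(samples, gain)
    print(f"gain {gain:2d}: output noise (fraction of full scale) "
          f"{out.std():.5f}")
```

The output noise, measured against the fixed ADC full scale, grows roughly in proportion to the gain, which is why high-ISO frames look noisier even though the quantization step never changed.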
Modern cameras may have built-in tricks to average multiple exposures computationally to improve the SNR, but this requires a static subject and a good tripod.
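The averaging trick is easy to demonstrate; here is a hedged sketch with a synthetic scene and made-up noise levels. Averaging N uncorrelated frames cuts the rms noise by a factor of sqrt(N).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical static scene; each exposure adds independent sensor noise.
scene = rng.uniform(0.2, 0.8, size=(64, 64))

def exposure():
    """One noisy frame: the true scene plus additive sensor noise."""
    return scene + rng.normal(0.0, 0.05, scene.shape)

def rms_error(frame):
    return np.sqrt(np.mean((frame - scene) ** 2))

single = exposure()
stacked = np.mean([exposure() for _ in range(16)], axis=0)

# 16 averaged frames should show about 4x (sqrt(16)) less rms noise.
print("single frame rms error:", rms_error(single))
print("16-frame    rms error:", rms_error(stacked))
```

The sqrt(N) improvement only holds if the frames are of the same scene and register perfectly, hence the static subject and the tripod.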
Exposure is the product of the light input rate (determined by the object brightness and the f-number of the lens) and the exposure time (which in an SLR may be determined by a physical shutter).
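As a trivial sketch of that product (the numbers are illustrative): halving the light rate while doubling the time yields the same exposure.

```python
def exposure(illuminance_lux, time_s):
    """Exposure H = E * t, in lux-seconds."""
    return illuminance_lux * time_s

a = exposure(100.0, 1 / 60)   # brighter light, shorter time
b = exposure(50.0, 1 / 30)    # half the light rate, twice the time
print(a, b)                   # the two exposures are equal
```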
"f-number" is merely the focal length of the lens divided by its effective aperture diameter, which is why it is written "f/16": f is the focal length, so f/16 names a diameter on the aperture scale. Traditionally, a lens has click stops at "1/2 stop", where a "full stop" is a factor of 1.4:1 in f-number, or 2:1 in aperture area. A traditional shutter has click stops at full stops (2:1 in time), and ISO settings on the light meter at "1/3 stops", or 1 dB (in terms of light energy). Again, this last statement confuses some, since twice the light energy develops twice the charge in a digital camera, which is 6 dB at the ADC voltage input.
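The stop arithmetic above can be checked in a few lines (a sketch; the familiar markings 1, 1.4, 2, 2.8, 4, 5.6, 8, 11, 16 are nominal roundings of the exact values):

```python
import math

# Full stop: f-number scales by sqrt(2) ~ 1.4, aperture area by 2.
f_stops = [math.sqrt(2) ** n for n in range(9)]
print([round(f, 1) for f in f_stops])

# One stop = factor of 2 in light energy = 10*log10(2) ~ 3 dB,
# so 1/3 stop is ~1 dB in light terms.
third_stop_db = 10 * math.log10(2) / 3
print(round(third_stop_db, 2))

# But twice the light gives twice the charge, hence twice the voltage
# at the ADC input: 20*log10(2) ~ 6 dB in voltage terms.
full_stop_voltage_db = 20 * math.log10(2)
print(round(full_stop_voltage_db, 1))
```

The 1 dB versus 6 dB discrepancy is exactly the power-versus-voltage dB convention: light energy is a power-like quantity (10 log), while the ADC sees a voltage (20 log).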
Another important difference is "reciprocity", as in "reciprocity failure". With film at very long exposures, some of the light-induced charge on the photographic grains drains off while a grain's chemical state is still metastable and not fully exposed, so it takes more total light-induced charge on the grains to reach the stable "latent image" chemical state. I have not seen any good descriptions of analogous behavior in photosensors at very long exposure times, but I'm sure the astronomers have looked into this carefully with their cooled sensors.
Once, when using my 8x10 inch view camera downtown, shooting a cityscape, a tourist asked me how many megapixels I had. I did a mental calculation and replied "about 500". Repeating that calculation at home, I found it was approximately correct, although in film the sampling is random (the grain positions) with finite resolution, as opposed to the periodic sampling of a digital sensor.
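One way to reproduce that back-of-the-envelope figure (my assumption here, not stated above: the film and lens together resolve on the order of 50 line pairs per mm, i.e. roughly 100 "pixels" per mm):

```python
# Megapixel equivalent of an 8x10-inch sheet of film, assuming
# ~50 line pairs/mm resolving power, i.e. ~100 px/mm (illustrative).
MM_PER_INCH = 25.4
px_per_mm = 100

width_px = 8 * MM_PER_INCH * px_per_mm
height_px = 10 * MM_PER_INCH * px_per_mm
megapixels = width_px * height_px / 1e6
print(round(megapixels), "megapixels")   # on the order of 500
```

The exact figure obviously swings with the assumed resolving power, but anything in the 40-100 px/mm range keeps the answer in the hundreds of megapixels.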