Clearly, the photographer doesn't know any phonetics. We say cheese because its vowel is /i:/, a close front unrounded vowel. When people say cheese, their lips are spread as if they were smiling.
In contrast, the first syllable of gouda has /u:/, which is a close back rounded vowel. If you take a photo while someone is saying that, their lips will be protruded, as if they are pouting. No wonder the man in the cartoon isn't having much success in getting a good photo!
The other advantage of saying cheese is that it starts and ends with non-labial consonants, /tʃ/ and /z/. (They are produced inside the mouth rather than at the lips.) In contrast, parmigiano isn't so good because both the /p/ and the /m/ are produced at the lips, so if you clicked at that moment, the person in the resulting photo would look as if they were spitting.
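
For anyone who likes to see a rule written down, the reasoning above can be sketched as a small Python check. This is only an illustration: the feature sets are deliberately tiny, the function names are made up for this sketch, and the broad transcriptions (especially the one for parmigiano) are my own assumptions rather than dictionary entries.

```python
# Hypothetical, deliberately incomplete feature sets for this sketch.
SPREAD_VOWELS = {"i:"}                # close front unrounded: lips spread
ROUNDED_VOWELS = {"u:", "oʊ", "ɔ:"}   # rounded: lips protruded
LABIAL_CONSONANTS = {"p", "b", "m"}   # produced at the lips

def photo_problems(phonemes):
    """List the sounds in a word that would spoil a photo."""
    problems = []
    for sound in phonemes:
        if sound in ROUNDED_VOWELS:
            problems.append(f"/{sound}/ rounds the lips")
        elif sound in LABIAL_CONSONANTS:
            problems.append(f"/{sound}/ is made at the lips")
    return problems

def is_smile_word(phonemes):
    """A word works if it has a spread vowel and no problem sounds."""
    has_spread_vowel = any(sound in SPREAD_VOWELS for sound in phonemes)
    return has_spread_vowel and not photo_problems(phonemes)

# Assumed broad transcriptions, one list item per sound.
WORDS = {
    "cheese": ["tʃ", "i:", "z"],
    "gouda": ["g", "u:", "d", "ə"],
    "parmigiano": ["p", "ɑ:", "m", "ɪ", "dʒ", "ɑ:", "n", "oʊ"],
}

if __name__ == "__main__":
    for word, phonemes in WORDS.items():
        verdict = "good" if is_smile_word(phonemes) else "bad"
        print(word, verdict, photo_problems(phonemes))
```

With these assumptions, only cheese passes: gouda is flagged for its /u:/, and parmigiano for its /p/ and /m/.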