|Abstract:||Generative linguistics has primarily relied on introspection-based data for theory construction. Recently, however, experimental approaches to linguistic judgments have been on the rise. Using patterns of obstruent devoicing in Japanese loanwords as a test case, the current project attempts to contribute to this growing body of work by investigating how different experimental variables affect phonological judgments. The three variables tested in the current experiments are (i) scalar rating judgments vs. binary yes/no judgments, (ii) real words vs. nonce words, and (iii) orthographic stimuli vs. audio stimuli. The results show that (i) scalar rating tests and binary yes/no tests yield very similar patterns, (ii) nonce words show less variability in acceptability across different grammatical conditions than real words, and (iii) orthographic stimuli and audio stimuli yield comparable results, but (iv) audio-based experiments exaggerate the effect of particular phonetic implementation patterns as compared to orthography-based tests. Building on these results, this paper offers some suggestions for future experimentation on phonological judgments.