Low-cost call type classification for contact center calls using partial transcripts
Abstract
Call type and topic classification for contact center calls using automatically generated transcripts are not yet widely available, mainly because of the high cost and low accuracy of call-center-grade automatic speech transcription. To address these challenges, we examine whether using only partial conversations yields accuracy comparable to using entire customer-agent conversations. We exploit two characteristics of contact center calls. First, the calls are highly scripted and follow prescribed steps, and the customer's problem or request (i.e., the determinant of the call type) is typically stated at the beginning of a call. Thus, using only the beginning of calls may be sufficient to determine the call type. Second, agents often repeat or rephrase more clearly what customers said, so it may be sufficient to process only the agents' speech. Our experiments with 1,677 customer calls show that two partial transcripts, comprising only the agents' utterances and the first 40 speaker turns, actually produce slightly higher classification accuracy than a transcript set comprising the entire conversations. In addition, using partial conversations can significantly reduce the cost of speech transcription. Copyright © 2009 ISCA.
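The two partial-transcript selections described in the abstract can be illustrated with a minimal sketch. It assumes a call is already available as an ordered list of speaker turns. The `Turn` type, the speaker labels "agent" and "customer", and the function names are hypothetical and are not taken from the paper. The sketch covers only the selection of turns, not the transcription or classification steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # hypothetical labels: "agent" or "customer"
    text: str


def first_n_turns(turns: List[Turn], n: int = 40) -> List[Turn]:
    """Keep only the first n speaker turns (40 is the value in the abstract)."""
    return turns[:n]


def agent_only(turns: List[Turn]) -> List[Turn]:
    """Keep only the agent's utterances, preserving their order."""
    return [t for t in turns if t.speaker == "agent"]


def to_text(turns: List[Turn]) -> str:
    """Join the selected turns into one string for a downstream classifier."""
    return " ".join(t.text for t in turns)


# Toy call, invented purely for illustration.
call = [
    Turn("agent", "Thank you for calling, how can I help?"),
    Turn("customer", "My laptop will not start."),
    Turn("agent", "So your laptop does not power on, is that right?"),
    Turn("customer", "Yes."),
]

agent_text = to_text(agent_only(call))
opening_text = to_text(first_n_turns(call, n=2))
```

Under these assumptions, only the selected turns would need to be transcribed and passed to the classifier, which is where the cost reduction mentioned in the abstract would come from.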