Hi there! Thanks for your wonderful work. It is indeed an efficient way to determine whether a given text is written in Mandarin or Cantonese.
I've recently been working with a batch of crawled lyrics. Most are Mandarin songs, but a few are Cantonese. I'd like a fast, efficient way to tell them apart, so I'm wondering whether this algorithm was trained on a corpus that includes lyrics, and how well it performs on this kind of lyrics language classification task.
Thanks for your time!