This code casts each byte to a char, so multi-byte (non-ASCII) characters are not decoded correctly: each of their bytes becomes a separate, wrong character. The only reason this works is that the gov2 corpus is mostly ASCII.
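To make the failure mode concrete, here is a minimal sketch that contrasts a per-byte cast with UTF-8 decoding. The code under review is not shown, so the language (Rust) and the function names `bytes_as_chars` and `decode_utf8_lossy` are assumptions for illustration, not the author's implementation.

```rust
/// Hypothetical stand-in for the code under review: casts each byte to a
/// char, so every byte becomes its own code point (the same result as
/// decoding the input as Latin-1). Correct only for pure-ASCII input.
fn bytes_as_chars(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// One possible alternative: decode the bytes as UTF-8, replacing any
/// invalid sequence with U+FFFD instead of failing.
fn decode_utf8_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

For ASCII input the two functions agree. For the UTF-8 bytes of "café", where 'é' is encoded as the two bytes 0xC3 0xA9, the per-byte cast yields "cafÃ©" while UTF-8 decoding yields "café".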