After excluding English-language /int/ and /international/, we have 815 682 messages of which
Notes:
...according to the joint model. This may be problematic because the model has seen more messages from the larger corpus (kielipankki).
It may be interesting to compare this with the previous one.
In these, only messages scoring at least 1.9 under the target model were included; without this cutoff, the results are dominated by messages that score strongly negative under the opposite model and near zero under the target model.
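The effect of the cutoff can be illustrated with a minimal sketch. It assumes messages are ranked by the margin between the two models' scores (target minus opposite); the notes do not state the ranking criterion, so that is an assumption, and the data and the `rank_by_margin` helper are hypothetical.

```python
# Hypothetical data: (message id, score under target model, score under opposite model).
messages = [
    ("a", 2.4, 0.3),
    ("b", 0.1, -3.0),  # near zero in target, strongly negative in opposite
    ("c", 1.9, -0.5),
    ("d", 1.2, 1.0),
]

MIN_TARGET_SCORE = 1.9  # cutoff quoted in the notes


def rank_by_margin(rows, min_target=None):
    """Sort rows by (target - opposite), largest margin first.

    If min_target is given, rows scoring below it under the target
    model are dropped before ranking. Ranking by margin is an assumption.
    """
    if min_target is not None:
        rows = [r for r in rows if r[1] >= min_target]
    return sorted(rows, key=lambda r: r[1] - r[2], reverse=True)


unfiltered = rank_by_margin(messages)
filtered = rank_by_margin(messages, MIN_TARGET_SCORE)

print([r[0] for r in unfiltered])
print([r[0] for r in filtered])
```

Without the cutoff, message `b` ranks first purely because the opposite model dislikes it; with the cutoff, only messages the target model itself rates highly remain.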
Here each thread's average score had to be at least 1.5.
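A thread-level filter of this kind could look roughly like the following sketch. It assumes the average is the mean of per-message scores under the target model, which the notes do not spell out; the data and the `threads_above` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (thread id, message score under the target model).
scored = [
    ("t1", 2.0), ("t1", 1.4),
    ("t2", 0.2), ("t2", 2.5),
    ("t3", 1.5),
]

MIN_THREAD_MEAN = 1.5  # threshold quoted in the notes


def threads_above(rows, min_mean):
    """Return {thread id: mean score} for threads whose mean is >= min_mean."""
    by_thread = defaultdict(list)
    for thread_id, score in rows:
        by_thread[thread_id].append(score)
    means = {t: mean(scores) for t, scores in by_thread.items()}
    return {t: m for t, m in means.items() if m >= min_mean}


kept = threads_above(scored, MIN_THREAD_MEAN)
print(sorted(kept))
```

Note that a thread such as `t2` is dropped even though it contains one high-scoring message, because the threshold applies to the thread mean rather than to individual messages.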