Language Identification (LID) Models
pacificLID models: https://www.dropbox.com/sh/od1cfflcbqx5gfw/AACNhqvGmUqo6f4h9ujMYmVua?dl=0
idNet models: https://www.dropbox.com/sh/tr2xmusyp2u47yy/AABOkOXFKVfW77HG0-H-vVHAa?dl=0
Corpora
CGLU v4.2: The Corpus of Global Language Use (423 billion words)
http://www.earthlings.io/download_cglu.html
GeoWAC v1: Geographically-balanced Gigaword Corpora (45 billion words)
http://www.earthlings.io/download_geowac.html
CGLU v3: The Corpus of Global Language Use (16 billion words)
https://publicdata.canterbury.ac.nz/Research/NZILBB/jonathandunn/CGLU_v3/