K2-Shanghai Jiao Tong University
K2 is an open-source large language model designed specifically for the geosciences. Built on the LLaMA architecture, it is first further pre-trained on a large corpus of geoscience literature (including open-access papers and Wikipedia articles) and then fine-tuned on GeoSignal, a knowledge-intensive instruction dataset. On the GeoBench benchmark, which consists of NPEE questions and AP Geology, Geography, and Environmental Science tests, K2 outperforms baseline models of similar parameter scale on a number of objective and subjective tasks. The project will open-source the associated code and data.