Dean Professor Tam Kar-Yan (right) and Associate Professor Yang Yi of HKUST Business School announce InvestLM — Hong Kong's first open-source large language model for financial generative AI applications.
A research team at the School of Business and Management of The Hong Kong University of Science and Technology (HKUST Business School) has developed InvestLM, Hong Kong's first open-source large language model (LLM) for financial generative AI (GenAI) applications. InvestLM generates investment-related, human-like responses comparable to those of well-known commercial chatbots, including OpenAI’s ChatGPT. InvestLM’s model parameters[i] and the insights from its development process have been made publicly available to support industry practitioners and researchers in deploying LLM-related technology.
AI-powered natural-language chatbots based on LLMs with billions or even tens of billions of parameters are known for their proficiency in handling a wide range of real-time text-generation tasks. Developing such chat services used to require abundant computing power available only to very large corporations. This changed when open-source general-purpose LLMs became available earlier this year, allowing those with moderate computing resources to train LLMs for their own needs.
By adapting LLaMA-65B[ii], an open-source general-purpose LLM, with a high-quality and diverse set of finance and investment-related texts[iii] using a technique called instruction fine-tuning[iv], the HKUST research team has developed InvestLM, a state-of-the-art[v] LLM for the financial domain. Financial experts, such as hedge fund managers and research analysts, have rated InvestLM’s responses as comparable to those of state-of-the-art commercial LLMs, including GPT-3.5, GPT-4, and Claude-2.[vi] According to the research team, this demonstrates InvestLM’s strong capabilities in understanding financial texts, which can potentially enhance the work efficiency of finance and investment professionals in tasks such as providing investment insights and extracting information from, or summarizing, financial news and reports. Moreover, compared with LLaMA-65B, the foundation model from which it is adapted, InvestLM exercises better control over hallucinations in its responses.
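For readers who want a concrete sense of what instruction fine-tuning involves, the sketch below outlines the technique using the Hugging Face transformers and peft libraries. It is a minimal illustration under assumed settings (base model, prompt template, LoRA hyperparameters), not the team's actual training pipeline, which is described in the research paper.

```python
# A minimal sketch of instruction fine-tuning with the Hugging Face
# transformers and peft libraries. All settings here (base model, prompt
# template, LoRA hyperparameters) are assumptions for illustration, not the
# team's actual training configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE = "huggyllama/llama-7b"  # hypothetical stand-in; the team fine-tuned LLaMA-65B
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# Attach LoRA adapters so that only a small fraction of the weights is
# updated during fine-tuning, keeping memory requirements modest.
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)

# Format one (instruction, response) pair into a single training sequence.
prompt = (
    "### Instruction:\n"
    "Summarize the key risks disclosed in this 10-K excerpt.\n\n"
    "### Response:\n"
    "The filing highlights interest-rate risk and liquidity risk..."
)
batch = tokenizer(prompt, return_tensors="pt")

# Standard causal language modeling objective: the model learns to predict
# each next token of the formatted instruction-response pair.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()  # an optimizer step would follow in a real training loop
```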
Prof. TAM Kar-Yan, Dean of HKUST Business School, said, “Developing LLMs in-house can help financial firms gain a competitive edge through the application of generative AI, while retaining better control over proprietary information and customers’ data. Reflecting HKUST’s lead in embracing generative AI in Hong Kong’s tertiary education sector, this project has provided valuable insights for the financial sector on leveraging the fast-growing field of generative AI, in addition to making a powerful financial LLM accessible to the public.”
Prof. YANG Yi, Associate Professor of HKUST’s Department of Information Systems, Business Statistics and Operations Management, and a member of the research project team, said, "Financial LLMs are either inaccessible due to their proprietary nature, or of low quality. To our knowledge, InvestLM is the first open-source financial-domain LLM that can provide insightful responses to investment-related questions, as affirmed by financial professionals. By offering our insights into fine-tuning a foundation model for financial text generation, we hope that InvestLM can serve as a useful reference for industry practitioners in the financial sector and beyond who want to unlock the power of generative AI."
The research team has discovered that training an LLM with a diverse set of high-quality, domain-specific instructions is more effective at enhancing its capabilities on domain-specific tasks than using a large volume of general-purpose instructions. Where computational resources are limited, LLM developers often deploy smaller LLMs instead of larger ones, and the team has found that instruction tuning is especially effective at improving the performance of smaller LLMs.[vii] A hypothetical illustration of these two kinds of instructions is sketched below.
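The pair below contrasts a general-purpose instruction with a domain-specific one of the kind the team found more effective. Both examples are hypothetical; InvestLM's actual training instructions are drawn from the sources listed in note [iii].

```python
# Hypothetical examples only; InvestLM's real instruction set is described in
# the research paper and in note [iii].
general_purpose = {
    "instruction": "Write a short poem about the ocean.",
    "response": "...",
}
domain_specific = {  # CFA-style question, mirroring the sources in note [iii]
    "instruction": "A bond has a modified duration of 6.5. Estimate the "
                   "percentage price change if yields rise by 50 basis points.",
    "response": "Approximately -6.5 * 0.50% = -3.25%, i.e. the price falls "
                "by about 3.25%.",
}
```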
More information about InvestLM’s development is available in the research paper titled "InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning," which can be viewed at https://arxiv.org/abs/2309.13064. InvestLM’s model parameters can be downloaded at https://github.com/AbaciNLP/InvestLM.[viii]
HKUST Business School Dean Professor Tam Kar-Yan (right) said that the InvestLM project has provided valuable insights for the financial sector on leveraging the fast-growing field of generative AI. Associate Professor Yang Yi of HKUST’s Department of Information Systems, Business Statistics and Operations Management (left) noted that InvestLM is capable of providing insightful responses to investment-related questions, as affirmed by financial professionals.
End
About the HKUST Business School
The School of Business and Management of The Hong Kong University of Science and Technology (HKUST Business School) is young, innovative and committed to advancing global business knowledge. The School has forged an international reputation for world-class education programs and research performance, and has received many top global rankings. For more details about the School, please visit https://bm.hkust.edu.hk
For media enquiries, please contact:
HKUST Business School
Danny LEE
Tel: (852) 3469 2090
Email: dannyyklee@ust.hk
[i] Model parameters are numbers that a model learns during training to make text predictions. Also called weights, they control how the model turns input into output for text generation. Model parameters are essential in implementing an LLM-based chatbot service because they are the core components that enable the model to learn from data and perform various natural language processing tasks. Generally speaking, more parameters mean bigger and better models, but they also require more data and computation. (For a concrete illustration of counting a model's parameters, see the short sketch after these notes.)
[ii] LLaMA-65B is a state-of-the-art foundational large language model with 65 billion parameters, developed and released by Meta.
[iii] The dataset used to train InvestLM covers a wide range of finance-related topics, including Chartered Financial Analyst (CFA) exam questions, textbooks, academic journals, SEC filings, Stack Exchange quantitative finance discussions, financial natural language processing tasks, and investment questions.
[iv] Pre-training and fine-tuning are transfer learning techniques for large language models. Pre-training trains a model on a general text corpus, while fine-tuning adapts it to a specific task or dataset. Pre-training usually takes a long time and requires substantial computational resources; fine-tuning is comparatively lightweight.
[v] In an evaluation of InvestLM's performance against other LLMs, the comparison set comprised two instruction-tuned models from OpenAI (GPT-3.5 and GPT-4), two financial LLMs (BloombergGPT, a 50B-parameter foundation model, and FinMA, a model instruction-tuned on LLaMA-7B), and LLaMA-65B, the foundation model upon which InvestLM is built. GPT-4 achieved the best performance in 6 of the 9 tasks, while InvestLM achieved the best performance in 2 of the 9, suggesting that GPT-4 is the state-of-the-art commercial LLM.
[vi] The responses of commercial models were obtained in August 2023.
[vii] The relative improvement brought about by domain instruction tuning is considerably more pronounced for the smaller LLaMA-7B, an LLM of 7 billion parameters, than for the larger LLaMA-65B, an LLM of 65 billion parameters. The results indicate that in scenarios where computational constraints prevent deploying a LLaMA-65B model, domain instruction tuning proves vital in optimizing the performance of the smaller model.
[viii] InvestLM adopts the same licensing terms as LLaMA, which are non-commercial and for research use only.
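As a concrete illustration of note [i], the short sketch below loads a small, publicly available model and counts its learned parameters. The model chosen (GPT-2) is an arbitrary example for illustration, not InvestLM itself.

```python
# Illustrative only: count the learned parameters (weights) of a small public
# model. GPT-2 stands in here; InvestLM's own parameters can be downloaded
# from the GitHub repository listed above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # small, public example
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} learned parameters")  # roughly 124 million for GPT-2
```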