

Ricoh has developed a high-performance Japanese LLM (70 billion parameters) equivalent to GPT-4 through model merging.

RICOH COMPANY ·  Sep 29 23:00

Ricoh Co., Ltd. (President and CEO: Akira Oyama) has developed a high-performance Japanese large language model (LLM*4). Using its own merging expertise, Ricoh took "Llama-3-Swallow-70B"*1, a base model that improves the Japanese performance of "Meta-Llama-3-70B" provided by Meta Platforms, Inc., and merged into it a Chat Vector*2 extracted from Meta's Instruct model together with a Chat Vector*3 created by Ricoh. As a result, Ricoh has added a high-performance model on par with OpenAI's GPT-4 to the lineup of LLMs it develops and provides.
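
As an illustration, Chat Vector merging comes down to simple arithmetic on model weights: subtracting the base model's weights from an instruction-tuned model isolates an instruction-following "direction", which can then be added to a different base. The sketch below, using PyTorch and Hugging Face Transformers, is a minimal illustration of that idea; the model identifiers follow the names in this release, but the merge coefficient and the element-wise recipe are assumptions, not Ricoh's actual method.

```python
import torch
from transformers import AutoModelForCausalLM

# Minimal chat-vector merge sketch. Model IDs follow the names in the
# release; the 1.0 merge coefficient is an assumption, not Ricoh's recipe.
# In practice, 70B checkpoints are processed shard-by-shard to save memory.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B", torch_dtype=torch.float16)
instruct = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B-Instruct", torch_dtype=torch.float16)
target = AutoModelForCausalLM.from_pretrained(
    "tokyotech-llm/Llama-3-Swallow-70B-v0.1", torch_dtype=torch.float16)

base_sd, inst_sd = base.state_dict(), instruct.state_dict()
merged = {}
for name, w in target.state_dict().items():
    if name in base_sd and base_sd[name].shape == w.shape:
        chat_vector = inst_sd[name] - base_sd[name]  # instruction-following delta
        merged[name] = w + 1.0 * chat_vector
    else:
        merged[name] = w  # e.g. tensors resized for an extended tokenizer

target.load_state_dict(merged)
target.save_pretrained("swallow-70b-chat-vector-merged")
```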

The increasing spread of generative AI has led to growing demand for high-performance LLMs that companies can use in their operations. However, additional training of an LLM is costly and time-consuming. In response, an efficient development method known as "model merging"*5, which combines multiple models to create a higher-performing model, is attracting attention.
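
In its simplest form, model merging is parameter-space interpolation between checkpoints that share an architecture; no gradient steps are taken, which is why it avoids the cost of additional training. A generic sketch follows, with placeholder model names and an assumed 50/50 weighting:

```python
import torch
from transformers import AutoModelForCausalLM

# Generic linear merge of two same-architecture checkpoints.
# "org/model-a", "org/model-b" and the 0.5/0.5 weights are placeholders.
a = AutoModelForCausalLM.from_pretrained("org/model-a", torch_dtype=torch.float16)
b = AutoModelForCausalLM.from_pretrained("org/model-b", torch_dtype=torch.float16)

sd_b = b.state_dict()
merged = {k: 0.5 * v + 0.5 * sd_b[k] for k, v in a.state_dict().items()}

a.load_state_dict(merged)  # reuse model "a" as the container
a.save_pretrained("linear-merged")
# Merging costs only memory and disk I/O, far less than fine-tuning:
# no forward or backward passes are run at any point.
```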

Drawing on this model-merging expertise and its accumulated LLM development know-how, Ricoh developed the new LLM. The approach helps streamline the development of private LLMs unique to individual companies and of high-performance LLMs for specific operations.

Ricoh will continue to research and develop diverse, efficient methods and technologies, not only to build its own LLMs but also to provide optimal LLMs tailored to customers' applications and environments at low cost and with short lead times.

Evaluation Results*6 (ELYZA-tasks-100)

"ELYZA-tasks-100", a representative Japanese benchmark including complex instructions and tasks, Ricoh's LLM developed using the model merge method in this instance showed a high level of score equivalent to GPT-4. Furthermore, while other LLMs compared showed cases where answers were given in English depending on the task, Ricoh's LLM consistently provided responses in Japanese for all tasks showing high stability.

Comparison with other models on the benchmark (ELYZA-tasks-100); Ricoh's model is shown at the bottom.

Background of Ricoh's LLM development

Against the backdrop of a shrinking labor force and an aging population, using AI to raise productivity and enable high-value ways of working has become a key challenge for corporate growth. Many companies are therefore focusing on putting AI to practical use in their operations. To apply AI to actual operations, however, an LLM must be trained on large amounts of text data that include company-specific terminology and phrasing, creating the company's own AI model (private LLM).

Leveraging LLM development and training technology that ranks among the best in Japan, Ricoh can propose a range of AI solutions, such as providing private LLMs for enterprises and supporting the use of internal documents through retrieval-augmented generation (RAG).
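
Since the release mentions supporting the use of internal documents through RAG, the following bare-bones retrieval step shows the underlying idea: embed the documents, rank them by similarity to the query, and prepend the best matches to the prompt. The embedding model, corpus, and prompt here are generic placeholders, not details of Ricoh's offering.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy in-memory corpus standing in for internal company documents.
docs = [
    "Expense reports must be filed within 30 days of purchase.",
    "The VPN requires multi-factor authentication from off-site networks.",
]

embedder = SentenceTransformer("intfloat/multilingual-e5-large")  # assumed choice
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by cosine similarity (dot product of unit vectors)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    order = np.argsort(doc_vecs @ q)[::-1][:k]
    return [docs[i] for i in order]

query = "How soon do I need to submit an expense report?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would then be sent to the private LLM for grounded generation.
print(prompt)
```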

*1 Llama-3-Swallow-70B: A Japanese LLM developed by a research team including Professors Naoaki Okazaki and Rio Yokota of the School of Computing, Tokyo Institute of Technology, together with the National Institute of Advanced Industrial Science and Technology (AIST).
*2 Chat Vector: A vector obtained by subtracting a base model's weights from those of a model with instruction-following capabilities, retaining only the instruction-following capability.
*3 Ricoh's Chat Vector: Extracted from an Instruct model fine-tuned on approximately 16,000 instruction-tuning examples, including data developed in-house by Ricoh, starting from Meta's base model "Meta-Llama-3-70B".
*4 Large language model (LLM): A technology that understands the ambiguity and variation in natural language as spoken or written by humans, taking context into account even between distant words in a sentence. It enables context-aware processing, can execute tasks such as answering questions about natural-language text and summarizing documents with human-like accuracy, and is easy to train.
*5 Model merging: A method for creating a higher-performing model by combining multiple pretrained LLMs. It has attracted attention in recent years because it makes model development accessible without large-scale computational resources such as GPUs.
*6 Evaluation results as of September 24, 2024: The generated answers were scored using "GPT-4" (gpt-4-0613) and "GPT-4o" (gpt-4o-2024-05-13), with no deductions for responses in English. "Percentage of tasks answered in English" is the proportion of the 100 tasks that were answered in English.

Related News

  • Developed a 70-billion-parameter large language model (LLM) supporting three languages (Japanese, English, and Chinese), strengthening support for customers' private LLM construction
  • Developed an instruction-tuned Japanese LLM with 13 billion parameters
  • Developed a high-precision Japanese large language model (LLM) with 13 billion parameters

This news release is also available as a PDF file.

Ricoh has developed a high-performance Japanese LLM (70 billion parameters) equivalent to GPT-4 through model merging (224 KB, 2 pages).

* The company name and product name are trademarks or registered trademarks of each company.


