Exploring llama.cpp Internals and the Flow of a Basic Chat Program
This tutorial will guide you through the process of building a simple C++ program that performs inference on GGUF LLM models using the llama.cpp framework. We will cover the essential steps involved in loading the model, performing inference, and displaying the results. The code for this tutorial can be found here.
Prerequisites
To follow along with this tutorial, you will need the following:
A Linux-based operating system (native or WSL)
CMake installed
A GCC or Clang C/C++ toolchain installed
Step 1: Setting Up the Project
Let's start by setting up our project. We will be building a C/C++ program that uses llama.cpp to perform inference on GGUF LLM models.
Create a new project directory; let's call it smol_chat.
Within the project directory, let's clone the llama.cpp repository into a subdirectory called externals. This will give us access to the llama.cpp source code and headers.
mkdir -p externals
cd externals
git clone https://github.com/ggerganov/llama.cpp.git
cd ..
Step 2: Configuring CMake
Now, let's configure our project to use CMake. This will allow us to easily compile and link our C/C++ code with the llama.cpp library.
Create a CMakeLists.txt file in the project directory.
In the CMakeLists.txt file, add the following code:
cmake_minimum_required(VERSION 3.10)
project(smol_chat)

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# build the llama.cpp sources cloned into externals/ as part of this project
add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/externals/llama.cpp")

add_executable(smol_chat main.cpp LLMInference.cpp)
target_include_directories(smol_chat PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
target_link_libraries(smol_chat PUBLIC llama common)
This configuration specifies the minimum CMake version, sets the C++ standard and the standard-required flag, builds the llama.cpp sources cloned into externals/ via add_subdirectory, declares an executable named smol_chat compiled from main.cpp and LLMInference.cpp, adds the project root to the header search path, and links the executable against the llama and common libraries produced by the llama.cpp build.
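With this configuration in place, the project can be built with the standard CMake workflow, run from the project root (the generator and build type are left to your preference):

cmake -B build
cmake --build build -j

The resulting smol_chat binary is placed in the build/ directory.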
Step 3: Defining the LLM Interface
Next, let's define a C++ class that will handle the high-level interactions with the LLM. This class will abstract away the low-level llama.cpp function calls and provide a convenient interface for performing inference.
In the project directory, create a header file called LLMInference.h.
In LLMInference.h, declare the following class:
#pragma once

#include <string>
#include <vector>

#include "llama.h"

class LLMInference {
public:
    LLMInference(const std::string& model_path);
    ~LLMInference();
    void startCompletion(const std::string& query);
    std::string completeNext();

private:
    // opaque llama.cpp handles owned by this class
    llama_model*   _model   = nullptr;
    llama_context* _ctx     = nullptr;
    llama_sampler* _sampler = nullptr;
    // conversation history and its chat-template-rendered form
    std::vector<llama_chat_message> _messages;
    std::vector<char>               _formattedMessages;
    std::vector<llama_token>        _tokens;
    llama_batch _batch;
};
This class has a public constructor that takes the path to the GGUF LLM model as an argument and a destructor that deallocates any dynamically-allocated objects. It also has two public member functions: startCompletion, which initiates the completion process for a given query, and completeNext, which fetches the next token in the LLM's response sequence.
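Before moving on to the implementation, the sketch below shows how a main.cpp could drive this interface. It is illustrative only: the command-line handling and the assumption that completeNext() returns an empty string once the response is finished are conventions chosen for this example, not details specified by the article.

#include "LLMInference.h"

#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "usage: smol_chat <path-to-gguf-model>" << std::endl;
        return 1;
    }
    // construct the wrapper with the path to the GGUF model
    LLMInference llm(argv[1]);
    std::string query;
    std::cout << "> ";
    while (std::getline(std::cin, query)) {
        llm.startCompletion(query);
        // stream the response piece by piece until the assumed empty-string sentinel
        for (std::string piece = llm.completeNext(); !piece.empty(); piece = llm.completeNext()) {
            std::cout << piece << std::flush;
        }
        std::cout << "\n> ";
    }
    return 0;
}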
Step 4: Implementing LLM Inference Functions
Now, let's define the implementation for the LLMInference class in a file called LLMInference.cpp.
In LLMInference.cpp, include the necessary headers and implement the class methods as follows:
#include "LLMInference.h"
#include "common.h"

#include <cstdlib>
#include <cstring>
#include <iostream>

LLMInference::LLMInference(const std::string& model_path) {
    // load the model weights from the GGUF file; returns a heap-allocated handle
    _model = llama_load_model_from_file(model_path.c_str(), llama_model_default_params());
    // create an inference context bound to the model
    _ctx = llama_new_context_with_model(_model, llama_context_default_params());
    // build a sampler chain: temperature scaling followed by min-p filtering,
    // ending with a dist sampler so that a token can actually be selected
    _sampler = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(_sampler, llama_sampler_init_temp(0.8f));
    llama_sampler_chain_add(_sampler, llama_sampler_init_min_p(0.0f, 1));
    llama_sampler_chain_add(_sampler, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));
}
LLMInference::~LLMInference() {
    // the message contents are heap-allocated copies, so release them here
    for (llama_chat_message& msg : _messages) {
        std::free(const_cast<char*>(msg.content));
    }
    llama_sampler_free(_sampler);
    llama_free(_ctx);
    llama_free_model(_model);
}
void LLMInference::startCompletion(const std::string& query)
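{
    // The body below is a sketch rather than the article's original listing: it
    // assumes a late-2024 llama.cpp revision, and the exact signatures of
    // llama_chat_apply_template and llama_tokenize differ between releases.

    // append the user's message; the heap-allocated copy is freed in the destructor
    _messages.push_back({"user", strdup(query.c_str())});

    // render the whole conversation with the model's built-in chat template
    _formattedMessages.resize(llama_n_ctx(_ctx));
    int len = llama_chat_apply_template(_model, nullptr, _messages.data(), _messages.size(),
                                        /*add_ass=*/true,
                                        _formattedMessages.data(), _formattedMessages.size());
    std::string prompt(_formattedMessages.begin(), _formattedMessages.begin() + len);

    // tokenize the rendered prompt and stage the tokens in a batch for decoding
    _tokens.resize(prompt.size());
    int numTokens = llama_tokenize(_model, prompt.c_str(), prompt.size(),
                                   _tokens.data(), _tokens.size(),
                                   /*add_special=*/true, /*parse_special=*/true);
    _tokens.resize(numTokens);
    _batch = llama_batch_get_one(_tokens.data(), _tokens.size());
}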