llama.cpp: Writing a Simple C++ Inference Program for GGUF LLM Models

2025/01/14 03:04

Exploring llama.cpp internals and the flow of a basic chat program

This tutorial will guide you through the process of building a simple C++ program that performs inference on GGUF LLM models using the llama.cpp framework. We will cover the essential steps involved in loading the model, performing inference, and displaying the results. The code for this tutorial can be found here.

Prerequisites

To follow along with this tutorial, you will need the following:

A Linux-based operating system (native or WSL)

CMake installed

GNU/clang toolchain installed

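You can quickly confirm that these tools (plus git, which Step 1 uses for cloning) are available; any reasonably recent versions should do, with CMake 3.10 or newer matching the CMakeLists.txt used below:

cmake --version
g++ --version     # or: clang++ --version
git --version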

Step 1: Setting Up the Project

Let's start by setting up our project. We will be building a C/C++ program that uses llama.cpp to perform inference on GGUF LLM models.

Create a new project directory, let's call it smol_chat.

Within the project directory, let's clone the llama.cpp repository into a subdirectory called externals. This will give us access to the llama.cpp source code and headers.

mkdir -p externals
cd externals
git clone https://github.com/ggerganov/llama.cpp.git
cd ..

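Together with the files created in the next steps, the project layout will look roughly like this:

smol_chat/
├── CMakeLists.txt       (Step 2)
├── main.cpp             (program entry point)
├── LLMInference.h       (Step 3)
├── LLMInference.cpp     (Step 4)
└── externals/
    └── llama.cpp/       (cloned above)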

Step 2: Configuring CMake

Now, let's configure our project to use CMake. This will allow us to easily compile and link our C/C++ code with the llama.cpp library.

Create a CMakeLists.txt file in the project directory.

In the CMakeLists.txt file, add the following code:

cmake_minimum_required(VERSION 3.10)
project(smol_chat)

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Build llama.cpp (and its 'common' helper library) from the cloned sources.
add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/externals/llama.cpp")

add_executable(smol_chat main.cpp LLMInference.cpp)
target_include_directories(smol_chat PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
target_link_libraries(smol_chat PUBLIC llama common)

This code specifies the minimum CMake version, sets the C++ standard and the standard-required flag, pulls the cloned llama.cpp sources into the build with add_subdirectory, adds an executable named smol_chat built from main.cpp and LLMInference.cpp, includes headers from the current source directory, and links the llama and common libraries produced by the llama.cpp build against our executable.

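Once main.cpp, LLMInference.h, and LLMInference.cpp from the following steps are in place, the project can be configured and built out of source in the usual CMake way; the exact generator and flags are up to you, but a typical invocation looks like this:

cmake -B build
cmake --build build -j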

Step 3: Defining the LLM Interface

Next, let's define a C++ class that will handle the high-level interactions with the LLM. This class will abstract away the low-level llama.cpp function calls and provide a convenient interface for performing inference.

In the project directory, create a header file called LLMInference.h.

In LLMInference.h, declare the following class:

#pragma once

#include <string>
#include <vector>

#include "llama.h"

class LLMInference {

public:
    // Loads the GGUF model at model_path and prepares the context and sampler.
    LLMInference(const std::string& model_path);
    ~LLMInference();

    // Appends the user query to the chat and prepares the prompt for decoding.
    void startCompletion(const std::string& query);
    // Returns the next piece of the LLM's response.
    std::string completeNext();

private:
    llama_model*   llama_model_   = nullptr;
    llama_context* llama_context_ = nullptr;
    llama_sampler* llama_sampler_ = nullptr;

    std::vector<llama_chat_message> _messages;          // chat history (role/content pairs)
    std::vector<char>               _formattedMessages; // chat-template output buffer
    std::vector<llama_token>        _tokens;            // tokenized prompt
    llama_batch batch_;
};

This class has a public constructor that takes the path to the GGUF LLM model as an argument and a destructor that deallocates any dynamically-allocated objects. It also has two public member functions: startCompletion, which initiates the completion process for a given query, and completeNext, which fetches the next token in the LLM's response sequence.

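The CMake target above also expects a main.cpp. A minimal driver for this interface could look like the sketch below; it assumes, purely for illustration, that completeNext() returns an empty string once the response is finished, which is not part of the class contract defined above:

// main.cpp -- minimal sketch of a chat loop around LLMInference.
// Assumption: completeNext() returns an empty string when the model has
// finished responding; the tutorial may signal completion differently.
#include <iostream>
#include <string>

#include "LLMInference.h"

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "usage: smol_chat <path-to-gguf-model>\n";
        return 1;
    }

    LLMInference llm(argv[1]);

    std::string query;
    std::cout << "> ";
    while (std::getline(std::cin, query) && !query.empty()) {
        llm.startCompletion(query);
        // Stream the response piece by piece.
        for (std::string piece = llm.completeNext(); !piece.empty(); piece = llm.completeNext()) {
            std::cout << piece << std::flush;
        }
        std::cout << "\n> ";
    }
    return 0;
}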

Step 4: Implementing LLM Inference Functions

Now, let's define the implementation for the LLMInference class in a file called LLMInference.cpp.

In LLMInference.cpp, include the necessary headers and implement the class methods as follows:

#include "LLMInference.h"

#include“LLMInference.h”

#include "common.h"

#include“common.h”

#include

#包括

#include

#包括

#include

#包括

LLMInference::LLMInference(const std::string& model_path) {

LLMInference::LLMInference(const std::string& model_path) {

llama_load_model_from_file(&llama_model_, model_path.c_str(), llama_model_default_params());

llama_load_model_from_file(&llama_model_, model_path.c_str(), llama_model_default_params());

llama_new_context_with_model(&llama_context_, &llama_model_);

llama_new_context_with_model(&llama_context_, &llama_model_);

llama_sampler_init_temp(&llama_sampler_, 0.8f);

llama_sampler_init_temp(&llama_sampler_, 0.8f);

llama_sampler_init_min_p(&llama_sampler_, 0.0f);

call_sampler_init_min_p(&call_sampler_, 0.0f);

}

LLMInference::~LLMInference() {

for (auto& msg : _messages) {

for (自動&訊息:_messages) {

std::free(msg.content);

std::free(msg.content);

}

}

llama_free_model(&llama_model_);

llama_free_model(&llama_model_);

llama_free_context(&llama_context_);

llama_free_context(&llama_context_);

}

void LLMInference::startCompletion(const std::string& query)
