👏 I'm glad to see you're focusing on DevOps options for AI apps. In my opinion, llama.cpp will remain the best way to launch a production LLM server. One notable feature is its built-in support for serving concurrent requests. Passing `-np 4` (or `--parallel 4`) runs four server slots in parallel, where 4 can be whatever number of concurrent requests you want to handle. One thing to remember is that the total context window gets divided evenly among the slots. For example, if you pass `-c 4096` with four slots, each slot gets a context of 1024 tokens. Adding the `--n-gpu-layers` (`-ngl 99`) flag offloads the model layers to your GPU for the best performance. So a command combining `-c 4096 -np 4 -ngl 99` will offer excellent concurrency on a machine with an RTX 4090.
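To make that concrete, here's a minimal sketch of a `llama-server` invocation with those flags; the model path is a placeholder, so swap in whatever GGUF file you're actually serving:

```sh
# -c 4096  : total context, divided across slots (1024 tokens per slot here)
# -np 4    : serve up to 4 requests in parallel
# -ngl 99  : offload all model layers to the GPU
llama-server \
  -m ./models/llama-3-8b-instruct.Q4_K_M.gguf \
  -c 4096 -np 4 -ngl 99
```

If you need more usable context per request, bump `-c` up proportionally (e.g. `-c 16384` with `-np 4` gives each slot 4096 tokens), VRAM permitting.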
Mozilla's Llamafile format is very flexible for deploying LLMs across operating systems, since it packs the model and runtime into a single executable. NVIDIA NIM has the advantage of bundling other model types, such as audio or video models.
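The Llamafile workflow is about as simple as it gets; here's a rough sketch, with the filename standing in for whichever llamafile you've downloaded:

```sh
# A llamafile is one self-contained executable: model weights plus runtime.
chmod +x model.llamafile   # mark it executable after downloading
./model.llamafile          # the same file runs on Linux, macOS, and the BSDs
```

On Windows you'd rename the file with an `.exe` extension instead of using `chmod`.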