Syntax Error-Free and Generalizable Tool Use for LLMs: ToolDec Eliminates Syntax Errors
#llms #toolaugmentation #syntaxerrors #decodingalgorithm #finitestatemachine #tooldec #toolselection #syntaxerrorfree
https://hackernoon.com/syntax-error-free-and-generalizable-tool-use-for-llms-tooldec-eliminates-syntax-errors
Researchers propose TOOLDEC, a finite-state machine-guided decoding algorithm for LLMs that reduces errors and improves tool use.
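The core idea the article describes can be sketched in a toy form: a finite-state machine over tool-call syntax masks out tokens that would produce a syntax error, so the model can only emit valid transitions. The FSM, tool names, and `model_ranking` function below are illustrative assumptions, not TOOLDEC's actual implementation.

```python
# Toy FSM over a tiny tool-call grammar: tool name, "(", one digit, ")".
# State -> {allowed token -> next state}. All names here are hypothetical.
FSM = {
    "start": {"add": "args", "mul": "args"},   # only known tool names are legal
    "args":  {"(": "num"},
    "num":   {"1": "close", "2": "close"},
    "close": {")": "done"},
}

def constrained_decode(model_ranking):
    """Greedy decoding, but restricted to FSM-legal tokens at each step."""
    state, out = "start", []
    while state != "done":
        allowed = FSM[state]
        # take the model's highest-ranked token that the FSM permits
        token = next(t for t in model_ranking(out) if t in allowed)
        out.append(token)
        state = allowed[token]
    return "".join(out)

# Hypothetical model that (wrongly) prefers the unknown tool name "sub";
# the FSM mask prevents that syntax error from ever being emitted.
def model_ranking(prefix):
    return ["sub", "add", "(", "1", "2", ")"]

print(constrained_decode(model_ranking))  # → "add(1)"
```

Without the mask, the model would emit the nonexistent tool `sub`; with it, decoding can only ever produce a well-formed call.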
Syntax Error-Free and Generalizable Tool Use for LLMs: Appendix
#llms #toolaugmentation #syntaxerrors #decodingalgorithm #finitestatemachine #tooldec #toolselection #syntaxerrorfree
https://hackernoon.com/syntax-error-free-and-generalizable-tool-use-for-llms-appendix
Researchers propose TOOLDEC, a finite-state machine-guided decoding algorithm for LLMs that reduces errors and improves tool use.
Meet The AI Tag-Team Method That Reduces Latency in Your Model's Response
#llms #decodingalgorithm #transformers #speculativedecoding #makellmsfaster #howtomakechatbotfaster #fasteraigeneration #speculativedecodingllms
https://hackernoon.com/meet-the-ai-tag-team-method-that-reduces-latency-in-your-models-response
Speculative decoding is an advanced AI inference technique that is gaining traction in natural language processing (NLP) and other sequence generation tasks.
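The mechanism behind the article's title can be sketched with two toy deterministic "models" (assumptions for illustration, not real LLMs): a cheap draft model proposes a few tokens, and the expensive target model verifies them in one pass, keeping the longest agreeing prefix plus one token of its own.

```python
def target_next(prefix):
    # "expensive" target model: next token is (last + 1) mod 10
    return (prefix[-1] + 1) % 10

def draft_next(prefix):
    # "cheap" draft model: agrees with the target except after token 5
    return 0 if prefix[-1] == 5 else (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    # 1) draft proposes k tokens autoregressively (cheap, sequential)
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    # 2) target verifies all k proposals (a single batched pass in a real system)
    accepted, ctx = [], list(prefix)
    for t in proposal:
        if target_next(ctx) != t:
            break
        accepted.append(t)
        ctx.append(t)
    # 3) target always contributes one corrected/extra token, so progress >= 1
    accepted.append(target_next(ctx))
    return prefix + accepted

seq = [0]
while len(seq) < 12:
    seq = speculative_step(seq)
print(seq)  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```

When the draft agrees, each target pass advances up to k + 1 tokens instead of 1, which is where the latency reduction comes from.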
How vLLM Implements Decoding Algorithms
#llms #vllm #decodingalgorithm #algorithms #endtoendservingsystem #gpubasedinference #cuda #python
https://hackernoon.com/how-vllm-implements-decoding-algorithms
vLLM implements various decoding algorithms using three key methods: fork, append, and free.
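The three methods the article names can be sketched as a toy reference-counted block table with copy-on-write; the class below is an assumed illustration of the idea, not vLLM's code.

```python
class BlockManager:
    """Toy KV-block manager sketching fork / append / free semantics."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}      # block_id -> list of tokens
        self.refcount = {}    # block_id -> number of sequences sharing it
        self.tables = {}      # seq_id -> list of block_ids
        self._next = 0

    def _alloc(self, tokens):
        bid, self._next = self._next, self._next + 1
        self.blocks[bid] = list(tokens)
        self.refcount[bid] = 1
        return bid

    def fork(self, parent, child):
        # child shares the parent's blocks; just bump the refcounts
        self.tables[child] = list(self.tables[parent])
        for bid in self.tables[child]:
            self.refcount[bid] += 1

    def append(self, seq, token):
        table = self.tables.setdefault(seq, [])
        if table and len(self.blocks[table[-1]]) < self.block_size:
            bid = table[-1]
            if self.refcount[bid] > 1:            # shared block: copy-on-write
                self.refcount[bid] -= 1
                bid = self._alloc(self.blocks[table[-1]])
                table[-1] = bid
            self.blocks[bid].append(token)
        else:
            table.append(self._alloc([token]))    # last block full: new block

    def free(self, seq):
        # drop the sequence; reclaim blocks whose refcount hits zero
        for bid in self.tables.pop(seq):
            self.refcount[bid] -= 1
            if self.refcount[bid] == 0:
                del self.blocks[bid], self.refcount[bid]

m = BlockManager()
for t in [1, 2]:
    m.append("a", t)
m.fork("a", "b")        # "b" shares "a"'s blocks for free
m.append("b", 3)        # copy-on-write: "a" keeps [1, 2], "b" sees [1, 2, 3]
m.free("a")             # blocks used only by "a" are reclaimed
```

Fork makes parallel sampling and beam search cheap (sharing instead of copying), while free returns memory to the pool as soon as the last sharer is gone.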
How vLLM Can Be Applied to Other Decoding Scenarios
#llms #vllm #vllmapplications #decodingalgorithm #llmapplications #parallelsampling #osvirtualmemory #machinetranslation
https://hackernoon.com/how-vllm-can-be-applied-to-other-decoding-scenarios
This section demonstrates the general applicability of vLLM to other decoding scenarios, such as parallel sampling and machine translation.
PagedAttention and vLLM Explained: What Are They?
#llms #vllm #pagedattention #llmservingsystem #decodingalgorithm #attentionalgorithm #virtualmemory #copyonwrite
https://hackernoon.com/pagedattention-and-vllm-explained-what-are-they
This paper proposes PagedAttention, a new attention algorithm that allows attention keys and values to be stored in non-contiguous paged memory.
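The paging idea in that description can be sketched as follows: KV entries live in fixed-size physical blocks, and a per-sequence block table maps logical token positions to non-contiguous physical blocks, much like OS virtual memory. The class and sizes below are illustrative assumptions, not the paper's implementation.

```python
BLOCK = 4  # assumed tokens per physical block

class PagedKVCache:
    def __init__(self):
        self.physical = []      # pool of blocks; each holds up to BLOCK kv entries
        self.block_table = []   # logical block index -> physical block index

    def append(self, kv):
        # start a new physical block when the last one is missing or full;
        # physical blocks need not be contiguous or in logical order
        if not self.block_table or len(self.physical[self.block_table[-1]]) == BLOCK:
            self.physical.append([])
            self.block_table.append(len(self.physical) - 1)
        self.physical[self.block_table[-1]].append(kv)

    def get(self, pos):
        # translate a logical token position to (block, offset), as in paging
        return self.physical[self.block_table[pos // BLOCK]][pos % BLOCK]

cache = PagedKVCache()
for i in range(10):
    cache.append(("k%d" % i, "v%d" % i))
print(cache.get(5))  # → ('k5', 'v5')
```

Because only the last block can be partially filled, internal fragmentation is bounded by one block per sequence, rather than growing with the sequence length.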