Data Science by ODS.ai 🦜

Meta-Transformer: A Unified Framework for Multimodal Learning

Meta-Transformer is a framework for multimodal learning that processes information from a diverse range of modalities with a single model: natural language, 2D images, 3D point clouds, audio, video, time series, and tabular data. Raw input data from these modalities is first mapped into a shared token space, and an encoder with frozen parameters then extracts high-level semantic features from the tokens. With this design, the framework performs multimodal perception without any paired multimodal training data.

Experiments on various benchmarks show that Meta-Transformer handles a wide range of tasks: fundamental perception such as text, image, and audio processing, practical applications like X-Ray, infrared, and hyperspectral data interpretation, and data mining tasks involving graph, tabular, and time-series data.

Code link: https://github.com/invictus717/MetaTransformer
Paper link: https://arxiv.org/abs/2307.10802

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-meta-transformer

#deeplearning #nlp #transformer #cv

Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

To tackle the generation latency of large language models (LLMs), a new approach, Skeleton-of-Thought (SoT), has been developed. Motivated by human thinking and writing processes, SoT guides the LLM to generate the "skeleton" of an answer first and then fills in the content of each skeleton point in parallel. The result is a speed-up of up to 2.39x across 11 different LLMs.

Beyond speed, SoT has the potential to improve answer quality in terms of diversity and relevance. As an initial attempt at data-centric optimization for efficiency, SoT points to the possibility of pushing LLMs to think more like humans.
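
The skeleton-then-expand idea can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: `fake_llm` is a hypothetical stand-in for a real model call, the `SKELETON:` / `EXPAND:` prompt prefixes are made up for the example, and a thread pool plays the role of parallel API calls.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_llm(prompt):
    """Hypothetical stand-in for an LLM call; returns canned text."""
    if prompt.startswith("SKELETON:"):
        return "1. Define the problem\n2. Outline the method\n3. Summarize results"
    # Any other prompt is treated as a request to expand one skeleton point.
    return "Expanded: " + prompt[len("EXPAND:"):].strip()


def parse_skeleton(text):
    """Turn a numbered list ("1. ...") into a list of point strings."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        _, _, point = line.partition(".")  # drop the leading "1" numbering
        points.append(point.strip())
    return points


def skeleton_of_thought(question, llm=fake_llm, max_workers=4):
    # Stage 1: one sequential call that produces the short skeleton.
    points = parse_skeleton(llm("SKELETON: " + question))
    # Stage 2: expand every point concurrently; pool.map keeps the order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        expansions = list(pool.map(lambda p: llm("EXPAND: " + p), points))
    return list(zip(points, expansions))


if __name__ == "__main__":
    for point, text in skeleton_of_thought("How do I write a paper?"):
        print(point, "->", text)
```

With a real model, the latency gain would come from stage 2: the expansions no longer wait for each other.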

Paper link: https://arxiv.org/abs/2307.15337

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-sot

#deeplearning #nlp #llm

UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

UniversalNER applies targeted distillation with mission-focused instruction tuning to open named entity recognition (NER). The researchers distilled ChatGPT into more cost-efficient UniversalNER models without losing NER quality. Evaluated on 43 datasets across 9 diverse domains, UniversalNER outperforms instruction-tuned models like Alpaca and Vicuna by over 30 absolute F1 points on average.

What sets UniversalNER apart is that it acquires ChatGPT's capability to recognize arbitrary entity types while having only a fraction of the parameters, and it even surpasses ChatGPT's NER accuracy by 7-9 absolute F1 points on average. Most remarkably, without any direct supervision, it outperforms even state-of-the-art multi-task systems like InstructUIE. The result is a potent combination of efficiency and accuracy for NLP.
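
To make "absolute F1 points" concrete, here is a minimal sketch of entity-level F1. It assumes strict matching, where a predicted entity counts only if its span and type both match a gold entity; the paper's exact evaluation protocol may differ, and the example entities are made up.

```python
def entity_f1(gold, pred):
    """Entity-level F1 over (start, end, type) tuples with strict matching."""
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three gold entities, two predictions, one exact match.
gold = [(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "LOC")]
pred = [(0, 2, "PER"), (5, 6, "LOC")]  # second prediction has the wrong type

score = entity_f1(gold, pred)  # precision 1/2, recall 1/3
print(round(100 * score, 1))
```

A model scoring 70.0 on the same scale would be 30 absolute F1 points ahead of one scoring 40.0, regardless of the relative ratio between the two.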

Paper link: https://arxiv.org/abs/2308.03279
Project link: https://universal-ner.github.io/

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-universalner

#deeplearning #nlp #llm #ner

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Reinforcement Learning from Human Feedback (RLHF), the central method for fine-tuning state-of-the-art large language models (LLMs), is placed under the microscope in this paper. While recognizing RLHF's role in aligning AI systems with human goals, the authors note that its flaws have rarely been systematized. They survey open problems and fundamental limitations of RLHF, give an overview of techniques to understand, improve, and complement it in practice, and propose auditing and disclosure standards to improve societal oversight of RLHF systems. The work argues for a multi-faceted approach to developing safer AI systems.

Paper link: https://arxiv.org/abs/2307.15217

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-rlhf-overview

#deeplearning #nlp #llm #rlhf

LISA: Reasoning Segmentation via Large Language Model

The field of image segmentation has taken a leap forward with the introduction of LISA (Large Language Instructed Segmentation Assistant). This cutting-edge model excels at "reasoning segmentation," a novel task that generates segmentation masks from complex and implicit text queries. Building upon the capabilities of multi-modal Large Language Models, LISA expands its vocabulary with a <SEG> token and introduces an innovative "embedding-as-mask" paradigm to achieve this feat. Notably, the model is adept at intricate reasoning, utilizes world knowledge, offers explanatory answers, and can handle multi-turn conversations.

What stands out about LISA is its robust zero-shot capability: even when trained exclusively on datasets that lack reasoning-based tasks, it performs impressively well. Moreover, fine-tuning with just 239 reasoning segmentation image-instruction pairs enhances the model's performance further.
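
The "embedding-as-mask" idea can be illustrated with a toy example. This is a deliberate simplification and not LISA's architecture: the actual model passes the `<SEG>` embedding to a learned mask decoder, whereas the sketch below assumes a plain dot product between the embedding and per-pixel features, followed by a sigmoid and a threshold. All names and numbers are hypothetical.

```python
import math


def embedding_as_mask(seg_embedding, pixel_features, threshold=0.5):
    """Turn one <SEG> embedding into a binary mask (toy version).

    pixel_features is an H x W grid of feature vectors with the same
    dimension as seg_embedding.
    """
    mask = []
    for row in pixel_features:
        mask_row = []
        for feature in row:
            # Similarity between the <SEG> embedding and this pixel's feature.
            logit = sum(e * f for e, f in zip(seg_embedding, feature))
            probability = 1.0 / (1.0 + math.exp(-logit))
            mask_row.append(1 if probability > threshold else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2-dimensional embedding and a 2 x 2 feature map.
seg_embedding = [1.0, -1.0]
pixel_features = [
    [[2.0, 0.0], [0.0, 2.0]],
    [[1.0, 1.0], [3.0, 1.0]],
]
mask = embedding_as_mask(seg_embedding, pixel_features)
print(mask)
```

The point of the paradigm is that the language model only has to emit one special token; the mask itself is decoded from that token's embedding rather than generated as text.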

Paper link: https://arxiv.org/abs/2308.00692
Code link: https://github.com/dvlab-research/LISA

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-lisa

#deeplearning #cv #nlp #imagesegmentation #largelanguagemodel

OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

Unlike existing closed-source datasets, OBELICS is a vast, open-source, web-scale dataset of interleaved image-text documents, curated for training large multimodal models. It contains 141 million web pages from Common Crawl, 353 million images, and 115 billion text tokens.

But it's not just about the numbers; it's about results. To validate the dataset, models with 9 and 80 billion parameters, named IDEFICS, were trained on OBELICS and showed competitive performance across various multimodal benchmarks. This makes OBELICS a practical open alternative to closed-source training data.

Paper link: https://huggingface.co/papers/2306.16527
Model card link: https://huggingface.co/HuggingFaceM4/idefics-80b-instruct
Blogpost link: https://huggingface.co/blog/idefics

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-obelisc

#deeplearning #cv #nlp #largelanguagemodel #opensource

Giraffe: Adventures in Expanding Context Lengths in LLMs

Modern Large Language Models (LLMs) have revolutionized our ability to process and understand vast amounts of textual data. Yet, these models, like LLaMA and LLaMA2, often come with a caveat: they're constrained by fixed context lengths, which means they're limited in handling longer sequences of input data at evaluation. This paper tackles that constraint by investigating a variety of methods for "context length extrapolation," which essentially enables these models to understand and work with longer text sequences. Among the techniques explored, the paper introduces an innovative "truncated basis" strategy for altering positional encodings within the attention mechanism, promising a more scalable future for LLMs.

The researchers put these methods to the test with three new evaluation tasks (FreeFormQA, AlteredNumericQA, and LongChat-Lines), which provide a more nuanced measure of model performance than the traditionally used metric of perplexity. Their findings? Linear scaling came out on top as the most effective way to extend the context length, while the truncated basis method showed potential for future exploration. To support further research, the paper releases three long-context models, named Giraffe, with context lengths of 4k, 16k, and 32k.
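
Linear scaling is easy to show on rotary position embeddings (RoPE), the positional encoding used by LLaMA. The sketch below is a minimal illustration rather than the Giraffe code: positions are divided by a constant factor so that a longer sequence is squeezed into the position range seen during training. The function name and the lengths 2048 and 8192 are illustrative assumptions.

```python
def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotation angles RoPE applies at one position.

    scale > 1 applies linear scaling: the position index is divided by
    the scale before the per-frequency angles are computed.
    """
    scaled_position = position / scale
    return [scaled_position * base ** (-2 * i / dim) for i in range(dim // 2)]


# Illustrative numbers: trained on 2048 positions, extended to 8192.
trained_length, target_length = 2048, 8192
scale = target_length / trained_length  # 4.0

# Position 8000 is far outside the trained range, but after scaling it
# produces exactly the angles of position 2000, which the model has seen.
extended = rope_angles(8000, dim=8, scale=scale)
in_range = rope_angles(2000, dim=8)
print(extended == in_range)
```

The trade-off is resolution: neighbouring tokens end up closer together in position space, which is why models are usually fine-tuned after the scale is changed.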

Paper link: https://arxiv.org/abs/2308.10882
Code link: https://github.com/abacusai/Long-Context

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-giraffe

#deeplearning #cv #nlp #largelanguagemodel #opensource #largecontext

RecMind: Large Language Model Powered Agent For Recommendation

Recent advancements have significantly improved the capabilities of Large Language Models (LLMs) in various tasks, yet their potential in the realm of personalized recommendations has been relatively unexplored. To address this gap, a new LLM-powered autonomous recommender agent called RecMind has been developed. RecMind is designed to provide highly personalized recommendations by leveraging planning algorithms, tapping into external data sources, and using individualized data.

One standout feature of RecMind is its novel "Self-Inspiring" algorithm, which enhances the model's planning abilities. At each intermediate planning step, the algorithm prompts the model to consider all previously explored states when planning the next step, thereby improving its understanding and use of historical information. RecMind has been evaluated across multiple recommendation tasks: rating prediction, sequential and direct recommendation, explanation generation, and review summarization. The results show that RecMind outperforms existing LLM-based methods in these tasks and is competitive with the specialized P5 model.

Paper link: https://arxiv.org/abs/2308.14296

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-recmind

#deeplearning #nlp #llm #recommender

Forwarded from Machinelearning

✔️ "Speech and Language Processing": 3rd edition of the book

This open textbook is considered the de facto standard and one of the most authoritative and comprehensive resources for studying natural language processing (NLP), computational linguistics, and speech processing.

🌟 Authors: Dan Jurafsky and James H. Martin, well-known figures in NLP and computational linguistics. The book is regarded as a classic text, updated to include modern methods such as transformers, which dominate the field of NLP.

The book is divided into three parts, comprising 24 main chapters and 8 appendices.

The topics cover a wide range, including:
😶 Fundamental algorithms
😶 NLP (natural language processing) applications
😶 Regular expressions
😶 Neural networks and transformers
😶 Machine translation and other aspects of NLP
😶 Annotation (or labeling) of linguistic structure

Slides in PPTX and PDF formats are available for each chapter, which makes the resource useful for instructors.

For anyone interested in learning NLP, this is a fantastically useful resource.

🟡 Book in PDF
🟡 All chapters
🟡 More NLP books

@ai_machinelearning_big_data

#freebook #opensource #nlp

Forwarded from Machinelearning