Multiagent Finetuning. Researchers introduced multiagent finetuning, a self-improvement approach for language models.
Unlike traditional single-agent finetuning methods that often plateau after a few iterations, this approach uses a society of language models derived from the same base model but independently specialized through multiagent interactions.
The method assigns some models as generation agents that produce initial responses, and others as critic agents that evaluate and refine those responses.
Through this specialization, the system maintains diverse reasoning chains and consistently improves over multiple rounds of fine-tuning.
They demonstrate significant performance gains across various reasoning tasks using both open-source models (Phi-3, Mistral, LLaMA-3) and proprietary models (GPT-3.5).
GitHub.
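For intuition, here is a minimal sketch of the generate/critique split described above. Everything is stubbed (the call_model and finetune functions are hypothetical placeholders, not the paper's code); it only shows how role-specific data keeps the agents specialized.

```python
# Minimal sketch of multiagent finetuning (illustrative only).
# `call_model` stands in for any LLM inference API; `finetune` for any SFT pipeline.

def call_model(model, prompt):
    """Placeholder for an LLM call; returns a string response."""
    return f"[{model}] response to: {prompt}"

def finetune(base_model, dataset, tag):
    """Placeholder for supervised finetuning; returns a new model id."""
    return f"{base_model}-{tag}-v{len(dataset)}"

BASE = "base-llm"
generators = [f"{BASE}-gen{i}" for i in range(2)]   # generation agents
critics    = [f"{BASE}-crit{i}" for i in range(2)]  # critic agents

def multiagent_round(prompts):
    gen_data, crit_data = [], []
    for p in prompts:
        # 1) each generation agent drafts an answer
        drafts = [call_model(g, p) for g in generators]
        # 2) each critic agent evaluates and refines the drafts
        refined = [call_model(c, f"Critique and refine: {drafts}") for c in critics]
        final = refined[-1]  # e.g. take the last refinement (or a majority vote)
        # 3) collect role-specific finetuning data so the agents stay specialized
        gen_data.append({"prompt": p, "completion": final})
        crit_data.append({"prompt": f"Critique: {drafts}", "completion": final})
    return gen_data, crit_data

gen_data, crit_data = multiagent_round(["What is 17 * 24?"])
generators = [finetune(g, gen_data, "gen") for g in generators]
critics    = [finetune(c, crit_data, "crit") for c in critics]
```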
arXiv.org
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Large language models (LLMs) have achieved remarkable performance in recent years but are fundamentally limited by the underlying training data. To improve models beyond the training data, recent...
Google presents a proposed successor to the Transformer architecture:
"Titans: Learning to Memorize at Test Time"
Titans is a new architecture that pairs attention with a meta in-context memory that learns how to memorize at test time. Titans are more effective than Transformers and modern linear RNNs, and can scale effectively to context windows larger than 2M tokens, with better performance than ultra-large models (e.g., GPT-4, Llama3-80B).
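As a rough illustration of "memorizing at test time": a toy linear associative memory updated at inference by a gradient step that makes it map the current key to the current value. This is a simplified sketch of the general idea only, not the Titans architecture.

```python
import numpy as np

d = 8
rng = np.random.default_rng(0)
M = np.zeros((d, d))            # fast memory: maps keys to values
W_k = rng.normal(size=(d, d))   # key projection (kept fixed here)
W_v = rng.normal(size=(d, d))   # value projection (kept fixed here)
lr = 1.0

def memorize(M, x):
    """One test-time gradient step on 0.5 * ||M k - v||^2 for the current token x."""
    k = W_k @ x
    k = k / np.linalg.norm(k)    # normalize keys to keep the step well-scaled
    v = W_v @ x
    err = M @ k - v              # "surprise": how badly memory predicts v from k
    return M - lr * np.outer(err, k)   # gradient of the loss w.r.t. M is outer(err, k)

def recall(M, x):
    k = W_k @ x
    return M @ (k / np.linalg.norm(k))

for _ in range(200):             # a stream of tokens updates the memory at inference time
    M = memorize(M, rng.normal(size=d))

x = rng.normal(size=d)
M = memorize(M, x)
print(np.linalg.norm(recall(M, x) - W_v @ x))  # ~0: x was just memorized
```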
"TITANS: Learning to Memorize at Test Time"
Titans: a new architecture with attention and a meta in-context memory that learns how to memorize at test time. Titans are more effective than Transformers and modern linear RNNs, and can effectively scale to larger than 2M context window, with better performance than ultra-large models (e.g., GPT4, Llama3-80B).
🆒7🔥3👀2
Reddit launched its own LLM-based search engine
It aggregates and summarizes relevant responses across threads + links you directly to read more.
Even if you assume GPT already includes all Reddit data, this is still helpful given the clear citations.
This is similar to the value Perplexity offers over ChatGPT: clearer answers and fewer hallucinations.
Reddit
Reddit Answers
Got a question? Ask it and get answers, perspectives, and recommendations from all of Reddit
Nvidia introduced GenMol: A Drug Discovery Generalist with Discrete Diffusion
Demo.
1. GenMol introduces a versatile framework for drug discovery by leveraging discrete diffusion and the SAFE molecular representation. This allows for efficient and effective handling of various tasks in the drug discovery pipeline, from de novo molecule generation to lead optimization.
2. A key innovation in GenMol is fragment remasking, a strategy that replaces fragments in a molecule with masked tokens for regeneration (see the sketch after this list). This approach simplifies the exploration of chemical space and improves the discovery of optimized molecules.
3. Unlike traditional autoregressive models, GenMol utilizes a bidirectional non-autoregressive decoding process. This significantly enhances computational efficiency and preserves generation quality, making it adaptable to fragment-constrained and goal-directed tasks.
4. Extensive experiments show GenMol's superior performance in de novo generation, fragment-constrained generation, and optimization tasks, often surpassing state-of-the-art methods such as SAFE-GPT. It achieves a better balance between molecule quality and diversity.
5. The framework's flexibility is demonstrated in tasks like hit generation and lead optimization, where it outperforms multiple specialized models by effectively exploring chemical space through its discrete diffusion architecture.
6. GenMol's unified approach to molecule generation avoids the need for task-specific fine-tuning, making it a practical tool for diverse applications in drug discovery while offering a scalable and efficient solution.
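To make the fragment-remasking idea (point 2) concrete, here is a toy sketch: one fragment of a fragment-structured string is replaced by a mask token and a stubbed denoiser regenerates it. The fragment strings, the [MASK] token, and the denoise function are illustrative placeholders, not GenMol's code.

```python
import random

MASK = "[MASK]"

def denoise(fragments):
    """Stub for a discrete-diffusion denoiser that fills masked fragments.
    A real model would iteratively unmask tokens conditioned on the rest."""
    candidates = ["c1ccncc1", "C(=O)N", "CC(C)O"]  # hypothetical fragment vocabulary
    return [random.choice(candidates) if f == MASK else f for f in fragments]

def fragment_remask(fragment_string, rng=random):
    """Replace one whole fragment with a mask token and regenerate it."""
    fragments = fragment_string.split(".")   # SAFE-style strings join fragments with '.'
    i = rng.randrange(len(fragments))
    fragments[i] = MASK                      # remask a whole fragment, not single atoms
    return ".".join(denoise(fragments))

# Toy molecule made of three fragments (illustrative, not a validated SAFE string).
molecule = "c1ccccc1.C(=O)O.CCN"
for _ in range(3):
    print(fragment_remask(molecule))
```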
arXiv.org
GenMol: A Drug Discovery Generalist with Discrete Diffusion
Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present Generalist Molecular...
OpenAI just launched 'Tasks', allowing users to schedule actions and reminders within ChatGPT.
Tasks can be one-time reminders or recurring actions (like a daily news rundown), with up to 10 active tasks at a time.
A new "4o with scheduled tasks" model will be available in the model dropdown, and ChatGPT will also be able to suggest tasks based on a user's chat history.
The beta feature is rolling out to Plus, Team, and Pro ChatGPT subscribers over the next few days.
Transformer²: Self-adaptive LLMs
This new paper from Sakana AI shows the power of an LLM that can self-adapt its weights to its environment.
In the future, the line between “pre-training” and “post-training” will be gone, and our models and agents will continuously adapt and self-improve.
Systems like these will pave the way for a new generation of adaptive AI capable of modifying their own weights and architecture to adapt to the changing nature of the tasks they encounter in the environment.
GitHub.
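To give a flavor of test-time weight adaptation, here is a simplified sketch under my own assumptions (not necessarily the paper's method): keep a frozen weight matrix, and at inference pick a small per-task vector that rescales its singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16))          # a frozen pretrained weight matrix
U, S, Vt = np.linalg.svd(W)            # decompose once, offline

# Small per-task "expert" vectors that rescale the singular values.
# Hand-set here for the sketch; in practice they would be learned.
experts = {
    "math": 1.0 + 0.1 * rng.normal(size=S.shape),
    "code": 1.0 + 0.1 * rng.normal(size=S.shape),
}

def classify_task(prompt):
    """Stub dispatcher: a real system would infer the task from the prompt itself."""
    return "code" if "def " in prompt else "math"

def adapt(task):
    """Return task-adapted weights: same U and V, rescaled singular values."""
    z = experts[task]
    return U @ np.diag(S * z) @ Vt

prompt = "def fib(n): ..."
W_adapted = adapt(classify_task(prompt))   # weights chosen at inference time
print(np.linalg.norm(W_adapted - W))       # a modest, structured change to the frozen weights
```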
arXiv.org
Transformer-Squared: Self-adaptive LLMs
Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and static in their ability to handle...
Another top-notch open-source model at OpenAI/Meta/Google levels from MiniMax AI (Chinese lab, ex-SenseTime, $850M raised). Massive MoE similar to DeepSeek.
Excels at long context (4M tokens!), which is really interesting; need to dig into their lightning attention variant.
Paper.
agent.minimax.io
MiniMax Agent: Minimize Effort, Maximize Intelligence
Discover MiniMax Agent, your AI supercompanion, enhancing creativity and productivity with tools for meditation, podcast, coding, analysis, and more!
The Berkeley Sky Computing Lab just trained Sky-T1-32B-Preview, an o1-preview-level reasoning model, spending only $450 to create the instruction dataset.
The data is 17K math and coding problems solved step by step. They created this dataset by prompting QwQ, at a cost of about $450.
Can it be done without another reasoning model to distill?
Teach a 1000 student class and assign 17 homework problems. Side benefit: make $10M by charging $10K tuition.
Model, data, and full code here. Very interesting work that shows simple SFT is all you need (if you have good data).
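A minimal sketch of the recipe as described: collect step-by-step solutions from a teacher reasoning model, then run plain SFT on the resulting pairs. The teacher_solve function and file name are placeholders, not the Sky-T1 pipeline.

```python
import json

def teacher_solve(problem):
    """Placeholder for prompting a teacher reasoning model (e.g. QwQ) for a
    step-by-step solution; a real pipeline would also verify and filter answers."""
    return f"Step 1: ... Step 2: ... Final answer for: {problem}"

problems = ["Prove that the sum of two even numbers is even.",
            "Write a function that reverses a linked list."]

# Build an instruction dataset: (problem, worked solution) pairs.
with open("distill_sft.jsonl", "w") as f:
    for p in problems:
        record = {"instruction": p, "response": teacher_solve(p)}
        f.write(json.dumps(record) + "\n")

# The resulting JSONL is then fed to any standard SFT trainer for the student model.
```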
novasky-ai.github.io
Sky-T1: Train your own O1 preview model within $450
We introduce Sky-T1-32B-Preview, our reasoning model that performs on par with o1-preview on popular reasoning and coding benchmarks.
The next gen of speech language models can talk while listening.
Moshi (Kyutai) demo.
LSLM (ByteDance) demo.
Hertz-dev (SI)
These “full duplex” models, like the one powering ChatGPT Advanced Voice Mode, respond in <200 ms with human-like quality.
A cascaded ASR + LLM + TTS pipeline is much slower.
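A back-of-the-envelope comparison with assumed (purely illustrative) per-stage latencies shows why the cascaded pipeline struggles to get anywhere near 200 ms:

```python
# Illustrative numbers only; real latencies vary widely by model and hardware.
cascade_ms = {
    "ASR (endpointing + transcription)": 300,
    "LLM time-to-first-token": 400,
    "TTS first audio chunk": 200,
}
print("cascaded pipeline:", sum(cascade_ms.values()), "ms")  # stages run in sequence

# A full-duplex speech LM emits audio tokens directly while still listening,
# so its response latency is roughly one model step, not a sum of stages.
print("full-duplex target: <200 ms")
```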
This is brilliant work in the robotics space and has serious implications for digital-only agents as well.
Physical Intelligence released FAST, an efficient tokenizer for robot actions.
With FAST, you can train dexterous generalist policies via simple next token prediction, and get a 5x training speed-up over prior state of the art!
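For intuition, here is the naive way to turn continuous robot actions into discrete tokens for next-token prediction: uniform per-dimension binning. This is a baseline-style sketch only; FAST's tokenizer is more sophisticated than this, so treat it as illustration, not their scheme.

```python
import numpy as np

N_BINS = 256                     # vocabulary size per action dimension (assumed)
LOW, HIGH = -1.0, 1.0            # assume actions are normalized to [-1, 1]

def tokenize(actions):
    """Map continuous actions of shape (T, D) to integer tokens via uniform binning."""
    clipped = np.clip(actions, LOW, HIGH)
    bins = np.minimum((clipped - LOW) / (HIGH - LOW) * N_BINS, N_BINS - 1).astype(int)
    return bins.flatten()        # flatten time x dimension into one token stream

def detokenize(tokens, dim):
    """Map tokens back to bin-center values and restore the (T, D) shape."""
    centers = LOW + (tokens.astype(float) + 0.5) / N_BINS * (HIGH - LOW)
    return centers.reshape(-1, dim)

actions = np.random.uniform(-1, 1, size=(8, 7))    # 8 timesteps, 7-DoF arm
tokens = tokenize(actions)
recon = detokenize(tokens, dim=7)
print(tokens[:10], np.abs(recon - actions).max())  # small quantization error
```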
www.pi.website
FAST: Efficient Robot Action Tokenization
A new robot action tokenizer that allows us to train generalist policies 5x faster than previous models.
Hugging Face released a free course on agents.
You can learn how to create:
- Code agents that solve problems with code (see the sketch below)
- Retrieval agents that supply grounded context
- Custom functional agents that do whatever you need
Access the course on GitHub.
Scroll down to access the different parts. After completing one part, a link redirects you to the next.
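As a taste of the first bullet, a bare-bones code-agent loop: the LLM is asked to answer by writing Python, the snippet is executed, and the printed result is returned. The llm function is a stub; a real agent adds tool definitions, sandboxing, and retries.

```python
import io, contextlib

def llm(prompt):
    """Stub LLM that 'writes code' for the task; a real agent calls a model API."""
    return "result = sum(i * i for i in range(1, 11))\nprint(result)"

def run_code_agent(task):
    code = llm(f"Write Python code to solve: {task}\nPrint the final answer.")
    buffer = io.StringIO()
    # NOTE: exec of model-written code must be sandboxed in any real deployment.
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue().strip()

print(run_code_agent("Compute the sum of squares of 1..10"))   # -> 385
```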
#DeepSeek-R1 is here! Performance on par with OpenAI-o1. Fully open-source model & technical report. MIT licensed: Distill & commercialize freely.
Try DeepThink.
API guide.
Bonus: Open-Source Distilled Models!
Distilled from DeepSeek-R1, 6 small models fully open-sourced. 32B & 70B models on par with OpenAI-o1-mini.
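A minimal call sketch, assuming DeepSeek exposes an OpenAI-compatible chat endpoint; the base URL and model id below are my assumptions, so verify them against the linked API guide.

```python
# Assumes an OpenAI-compatible chat endpoint (see the official API guide).
# Base URL and model name are assumptions; verify against the docs.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_KEY", base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-reasoner",   # assumed model id for DeepSeek-R1
    messages=[{"role": "user", "content": "How many primes are there below 30?"}],
)
print(resp.choices[0].message.content)
```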
GitHub
DeepSeek-R1/DeepSeek_R1.pdf at main · deepseek-ai/DeepSeek-R1
Contribute to deepseek-ai/DeepSeek-R1 development by creating an account on GitHub.
The OpenAI website already has references to Operator/OpenAI CUA (Computer Use Agent): "Operator System Card Table", "Operator Research Eval Table", and "Operator Refusal Rate Table",
including comparisons to Claude 3.5 Sonnet Computer Use, Google Mariner, etc.
(preview of tables rendered using Claude Artifacts).
Agentic RAG Overview. This is a good intro to LLM agents and Agentic RAG.
It provides a comprehensive exploration of Agentic RAG architectures, applications, and implementation strategies.
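The core idea is that an agent decides when and what to retrieve instead of retrieving once up front. A stripped-down sketch with stub llm and search functions (hypothetical names):

```python
def search(query):
    """Stub retriever; a real system would query a vector store or web index."""
    return [f"doc about {query}"]

def llm(prompt):
    """Stub LLM; decides whether to retrieve more or answer."""
    return "ANSWER: 42" if "doc about" in prompt else "SEARCH: meaning of life"

def agentic_rag(question, max_steps=3):
    context = []
    for _ in range(max_steps):
        out = llm(f"Question: {question}\nContext: {context}\n"
                  "Reply with SEARCH: <query> or ANSWER: <final answer>.")
        if out.startswith("SEARCH:"):
            context += search(out.removeprefix("SEARCH:").strip())  # agent chose to retrieve
        else:
            return out.removeprefix("ANSWER:").strip()
    return "gave up"

print(agentic_rag("What is the meaning of life?"))
```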
arXiv.org
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
Large Language Models (LLMs) have revolutionized artificial intelligence (AI) by enabling human like text generation and natural language understanding. However, their reliance on static training...
AI-powered hedge fund using multiple agents for trading decisions.
GitHub
GitHub - virattt/ai-hedge-fund: An AI Hedge Fund Team
An AI Hedge Fund Team. Contribute to virattt/ai-hedge-fund development by creating an account on GitHub.
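A toy picture of the multi-agent pattern (illustrative only, not the repo's actual implementation): several analyst agents emit signals and a portfolio-manager step aggregates them.

```python
# Illustrative only; the linked repo structures its agents differently.
def fundamentals_agent(ticker):  return {"signal": "buy",  "confidence": 0.6}
def sentiment_agent(ticker):     return {"signal": "hold", "confidence": 0.5}
def technicals_agent(ticker):    return {"signal": "buy",  "confidence": 0.7}

def portfolio_manager(ticker):
    votes = [a(ticker) for a in (fundamentals_agent, sentiment_agent, technicals_agent)]
    score = sum({"buy": 1, "hold": 0, "sell": -1}[v["signal"]] * v["confidence"] for v in votes)
    return "buy" if score > 0.5 else "sell" if score < -0.5 else "hold"

print(portfolio_manager("NVDA"))   # -> buy (score = 1.3)
```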
ML + cell-free systems to engineer enzymes quickly.
The authors tested enzyme variants in ~11k cell-free reactions, using the data to build a regression model to make better amide synthetases.
Result: 1.6- to 42-fold higher activity. We'll be seeing many more papers like this.
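The modeling step is conceptually simple: featurize each variant, regress measured activity, then rank new variants by predicted activity. A hedged sketch on synthetic data (sklearn; not the authors' pipeline):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_variants, n_positions, n_aa = 2000, 10, 20

# Synthetic stand-in for the cell-free screen: one-hot mutation features -> measured activity.
X = rng.integers(0, 2, size=(n_variants, n_positions * n_aa)).astype(float)
true_effects = rng.normal(size=n_positions * n_aa)
y = X @ true_effects + 0.5 * rng.normal(size=n_variants)   # activity with assay noise

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=3, scoring="r2").mean())

# Rank unseen variants by predicted activity and test the top candidates next round.
model.fit(X, y)
candidates = rng.integers(0, 2, size=(100, n_positions * n_aa)).astype(float)
best = np.argsort(model.predict(candidates))[::-1][:5]
print("variants to test next:", best)
```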
Nature
Accelerated enzyme engineering by machine-learning guided cell-free expression
Nature Communications - While machine learning shows promise in expanding protein engineering efforts, its potential is limited by the challenge of gathering large datasets of sequence-function...
OpenAI announced The Stargate Project
The Stargate Project is a new company that intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States.
Stargate will begin deploying $100 billion immediately.
The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman.
Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements.
As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle.
This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services.
OpenAI
Announcing The Stargate Project
❗️OpenAI recently made a new, large Azure commitment that will continue to support all OpenAI products as well as training. This new agreement also includes changes to the exclusivity on new capacity, moving to a model where Microsoft has a right of first refusal (ROFR). To further support OpenAI, Microsoft has approved OpenAI’s ability to build additional capacity, primarily for research and training of models.
The Official Microsoft Blog
Microsoft and OpenAI evolve partnership to drive the next phase of AI
We are thrilled to continue our strategic partnership with OpenAI and to partner on Stargate. Today’s announcement is complementary to what our two companies have been working on together since 2019. The key elements of our partnership remain in place for…