Meta said: "We will release a multimodal Llama model over the coming months, but not in the EU due to the unpredictable nature of the European regulatory environment."
Unless the EU changes course here's what won't be coming to Europe:
- Apple Intelligence
- Agent Memory, all agents
- Llama 4, and beyond
Axios
Scoop: Meta won't offer future multimodal AI models in EU
The social media giant says EU regulators haven't provided clarity.
Everybody is talking about ColPali, a new retrieval model architecture that uses vision language models to directly embed page images, without relying on complex text extraction pipelines.
Combined with a late interaction matching mechanism, ColPali largely outperforms modern document retrieval pipelines while being drastically faster and end-to-end trainable.
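The late-interaction step can be sketched independently of the vision model: score each page by summing, over the query-token embeddings, the best match among that page's patch embeddings (the ColBERT-style MaxSim that ColPali adopts). A minimal sketch with made-up toy vectors:

```python
import numpy as np

def late_interaction_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """ColBERT-style MaxSim: for each query-token embedding, take the
    best-matching page-patch embedding, then sum those maxima."""
    # (num_query_tokens, num_patches) similarity matrix
    sims = query_emb @ page_emb.T
    return float(sims.max(axis=1).sum())

# Toy example: 2 query-token embeddings vs. 3 page-patch embeddings (dim 4).
query = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
page = np.array([[0.9, 0.1, 0.0, 0.0],
                 [0.0, 0.2, 0.9, 0.0],
                 [0.1, 0.8, 0.0, 0.1]])
print(round(late_interaction_score(query, page), 3))  # 1.7 (= 0.9 + 0.8)
```

At retrieval time each candidate page is scored this way against the query and pages are ranked by the score; the embeddings themselves come from the trained model.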
arXiv.org
ColPali: Efficient Document Retrieval with Vision Language Models
Documents are visually rich structures that convey information through text, but also figures, page layouts, tables, or even fonts. Since modern retrieval systems mainly rely on the textual...
Menlo Ventures launched the $100M Anthology Fund, a partnership with Anthropic to fund the Seed and Series A rounds of the next generation of AI startups around the world.
Startups will get:
- $25,000 in Anthropic credits
- Access to Anthropic's AI models
- Quarterly deep dives with the Anthropic team
- Biannual demo days hosted by Anthropic CPO Mike Krieger and cofounder Daniela Amodei
- Credits from Menlo Ventures portfolio companies
Airtable
Airtable | Everyone's app platform
Airtable is a low-code platform for building collaborative apps. Customize your workflow, collaborate, and achieve ambitious outcomes. Get started for free.
Here's OpenAI's new model: GPT-4o mini.
The company called the new release "the most capable and cost-efficient small model available today," and it plans to integrate image, video, and audio into it later.
The mini AI model is an offshoot of GPT-4o, OpenAI's fastest and most powerful model yet, which it launched in May during a livestreamed event with executives. The "o" in GPT-4o stands for omni; GPT-4o has improved audio, video, and text capabilities and can handle 50 different languages with improved speed and quality, according to the company.
CNBC
OpenAI debuts mini version of its most powerful model yet
OpenAI on Thursday launched a new AI model, "GPT-4o mini," the artificial intelligence startup's latest effort to expand use of its popular chatbot.
Check out OpenAI's new model GPT-4o mini: 82% on MMLU at 60 cents per 1M output tokens!
OpenAI has talked to Broadcom about developing a new AI chip.
OpenAI has been hiring former members of the Google unit that produces Google's AI chip, the tensor processing unit, and has sought to develop an AI server chip.
OpenAI has been talking to chip designers including Broadcom about working on the chip.
The team has discussed how the eventual chip could help the new venture Altman has envisioned, which aims to increase the amount of computing power for AI developers such as OpenAI.
The Information
OpenAI Has Talked to Broadcom About Developing New AI Chip
Last year, as the world's top artificial intelligence developers were racing to speed up their work using ever-bigger clusters of computers, OpenAI CEO Sam Altman was trying to play a longer game. He decided to start a new company that could develop and produce…
White paper on AI for Population Health and Digital Health in Singapore.
Key Highlights:
1. Collaborative Ecosystem: The partnership between A*STAR and EVYD leverages the unique strengths of both organizations, creating a robust ecosystem for AI-driven healthcare solutions. A*STAR's research prowess combined with EVYD's commercial expertise paves the way for groundbreaking advancements in population health management.
2. AI for Preventive Care: One of the standout perspectives of this white paper is its emphasis on shifting from reactive to preventive healthcare. AI's capability to analyze massive datasets enables early disease detection, personalized interventions, and proactive health management, fundamentally altering traditional healthcare paradigms.
3. Scalable Solutions: The Joint Lab's focus on developing scalable platforms and advanced data aggregation techniques is crucial. These solutions not only enhance public health surveillance but also ensure that AI technologies can be effectively integrated into real-world healthcare settings, providing tangible benefits to patients and healthcare providers alike.
4. Global Outreach: The Joint Lab's initiatives are not confined to Singapore. With successful deployments in Brunei and upcoming collaborations in the UAE, the impact of this partnership is set to drive meaningful change in global healthcare landscapes.
5. Ethical Considerations and Governance: Balancing innovation with ethical considerations is a critical aspect addressed in the white paper. Collaborative efforts from diverse stakeholders ensure that AI technologies improve health outcomes without compromising individual rights or safety.
New paper "Improving the Efficiency of #Payments Systems Using #Quantum Computing" from the Bank of Canada, GoodLabs Studio, and the University of Waterloo.
Two takeaways:
1. Reordering algorithms are a promising avenue for improving the efficiency of financial infrastructures relying on gross settlement.
2. The hybrid quantum solution to the reordering problem proved to be more reliable, consistent, and scalable than the classical computing hardware, particularly under constraints on the solve time.
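As a toy illustration of why ordering matters in gross settlement (this is not the paper's algorithm, which tackles an NP-hard version of the problem with a hybrid quantum solver), here is a brute-force search over payment orderings that minimizes the peak liquidity one bank must inject:

```python
from itertools import permutations

def peak_liquidity(payments, start_balance=0.0):
    """Max shortfall a bank must cover when settling payments gross, in order.
    payments: signed amounts (+ incoming, - outgoing)."""
    balance, worst = start_balance, 0.0
    for p in payments:
        balance += p
        worst = min(worst, balance)
    return -worst  # liquidity that must be injected up front

def best_order(payments):
    """Brute-force reordering; fine only for toy sizes, since the search
    space grows factorially (hence the interest in better solvers)."""
    return min(permutations(payments), key=peak_liquidity)

pays = [-50, +40, -30, +60]
print(peak_liquidity(pays))              # 50.0 in the given order
print(peak_liquidity(best_order(pays)))  # 0.0 after reordering
```

The same netting intuition is what reordering algorithms exploit at scale: incoming payments settled first can fund outgoing ones, shrinking the liquidity the system needs.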
pubsonline.informs.org
Improving the Efficiency of Payments Systems Using Quantum Computing | Management Science
High-value payment systems (HVPSs) are typically liquidity intensive because payments are settled on a gross basis. State-of-the-art solutions to this problem include algorithms that seek netting s...
VantAI and NVIDIA introduced PINDER and PLINDER, the Protein-protein and protein-Ligand INteraction Datasets and Evaluation Resources: >500x and 10x larger than previous datasets, they provide predefined splits that leverage a novel interface-clustering and splitting procedure together with strict quality criteria to ensure accurate and fair performance evaluation.
Researchers find that current performance estimates are not only inflated by up to 2-3x but also, surprisingly and excitingly, that deliberate splits and clustering allow training models that generalize much better.
Via extensive benchmarking and retraining of DiffDock(-PP), enabled by NVIDIA's BioNeMo framework, the researchers show that clustering and splits can dramatically improve generalization.
Paper.
GitHub.
bioRxiv
PLINDER: The protein-ligand interactions dataset and evaluation resource
Protein-ligand interactions (PLI) are foundational to small molecule drug design. With computational methods striving towards experimental accuracy, there is a critical demand for a well-curated and diverse PLI dataset. Existing datasets are often limited…
If you are crazy ambitious, technical, and ready to start building a multi-billion-dollar company, you should apply for the South Park Commons Founder Fellowship.
Apply here by August 9.
Patronus AI announced the release of Lynx, a new open-source hallucination detection model.
They claim that it outperforms existing AI models such as GPT-4, Claude-3-Sonnet, and more.
They are also open-sourcing a new hallucination benchmark, HaluBench: a large-scale 15k-sample dataset that contains challenging hallucination tasks and supports diverse real-world domains like finance and medicine.
Hugging Face: Lynx 8B.
Hugging Face: Lynx 70B.
You can use quantized Lynx-8B locally, deploy Lynx-70B with GPUs, or reach out to Patronus AI for easy API access.
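As a sketch of how a judge model like Lynx slots into a RAG pipeline: build a prompt from the question, the retrieved document, and the generated answer, then parse the judge's verdict. The field layout and PASS/FAIL convention below are illustrative assumptions; check Patronus AI's model card for the exact template Lynx was trained with.

```python
def build_judge_prompt(question: str, document: str, answer: str) -> str:
    """Assemble an evaluation prompt for a hallucination-detection model.
    NOTE: hypothetical layout, not the official Lynx template."""
    return (
        "Given the QUESTION, DOCUMENT and ANSWER, decide whether the ANSWER "
        "is faithful to the DOCUMENT. Reply PASS if faithful, FAIL if not.\n"
        f"QUESTION: {question}\nDOCUMENT: {document}\nANSWER: {answer}\n"
    )

def parse_verdict(model_output: str) -> bool:
    """True if the judge deems the answer faithful."""
    return "PASS" in model_output.upper()

prompt = build_judge_prompt(
    question="What is the capital of France?",
    document="Paris is the capital and largest city of France.",
    answer="The capital of France is Paris.",
)
print(parse_verdict("PASS"))  # True
```

The prompt string would be sent to whichever deployment you chose (local quantized 8B, GPU-hosted 70B, or the API), and the verdict gates whether the RAG answer is surfaced.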
arXiv.org
Lynx: An Open Source Hallucination Evaluation Model
Retrieval Augmented Generation (RAG) techniques aim to mitigate hallucinations in Large Language Models (LLMs). However, LLMs can still produce information that is unsupported or contradictory to...
AI at Work: report by BCG.
Top 10 TAKEAWAYS:
1. 2 out of 3 business leaders using Gen AI.
2. 50% of employees saving 5+ hours a week.
3. 50% of employees believe their job will disappear.
4. High potential: TMT, Finance, Energy, Healthcare, Consumer.
5. Brazil, India, MENA, Nigeria, S Africa >> Mature markets.
6. Top benefits: Save time, Move fast, Improve quality.
7. The more you use, the more you like.
8. Frontline workers beginning to embrace Gen AI.
9. Deploying Gen AI = Management, not Technology, challenge.
10. Train. Train. Train.
BCG Global
AI at Work 2024: Friend and Foe
In the past year, workers' confidence in GenAI has grown. So has their fear of job loss. Companies can address these dueling perspectives through deliberate thought and strategic action.
Apple released a 7B model that beats Mistral 7B.
They fully open sourced everything, including weights, training code, and dataset.
The secret to its performance: Data curation.
They released the best open LLM training dataset and a full pipeline for evaluating data curation methods.
Model
GitHub
Dataset
Paper
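To make "data curation" concrete, here is a toy heuristic document filter of the kind curation pipelines chain together. The thresholds are invented, and DCLM's actual pipeline relies on learned quality classifiers rather than hand rules like these:

```python
def passes_quality_filter(text: str, min_words: int = 20, max_rep: float = 0.3) -> bool:
    """Toy curation heuristic: drop very short documents and documents
    dominated by repeated words. Thresholds here are made up."""
    words = text.split()
    if len(words) < min_words:
        return False
    # Fraction of tokens that are repeats of an earlier token.
    repetition = 1 - len(set(words)) / len(words)
    return repetition <= max_rep

doc = "the cat sat on the mat " * 10
print(passes_quality_filter(doc))  # False: highly repetitive
```

A real pipeline applies many such filters (plus dedup and model-based scoring) to billions of documents, and the DCLM benchmark is precisely about measuring which filtering choices yield better downstream models.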
huggingface.co
apple/DCLM-7B · Hugging Face
Weโre on a journey to advance and democratize artificial intelligence through open source and open science.
Researchers released the NuminaMath datasets: the largest collection of ~1M math competition problem-solution pairs, ranging in difficulty from junior challenge to Math Olympiad preselection.
These datasets were used to win the 1st Progress Prize of the AI Math Olympiad and consist of two subsets:
1. Chain of Thought (CoT): 860k problem-solution pairs templated with CoT to enhance mathematical reasoning in natural language
2. Tool-integrated reasoning (TIR): 73k synthetic solutions derived from GPT-4 with code-execution feedback to decompose hard problems into simpler subproblems that can be solved with Python
Models trained on NuminaMath achieve best-in-class performance among open weight models and approach or surpass proprietary models on math competition benchmarks.
Tech report along with the training and inference code.
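The tool-integrated reasoning loop can be sketched as: the model proposes Python for a subproblem, the harness executes it, and the output is fed back into the next generation step. The model below is stubbed with a canned response, since the point here is the execution-feedback plumbing, not the LLM:

```python
import contextlib
import io

def run_python(code: str) -> str:
    """Execute model-proposed Python and capture stdout (the feedback signal).
    A real system would sandbox this; bare exec() is for illustration only."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def fake_model(subproblem: str) -> str:
    """Stand-in for the trained LLM: returns Python for one subproblem.
    A real TIR loop would call the model here, conditioned on prior feedback."""
    return "print(sum(n for n in range(1, 101) if n % 3 == 0))"

# One round of the generate -> execute -> feed back loop:
code = fake_model("Sum the multiples of 3 up to 100.")
result = run_python(code)
print(result)  # 1683
```

In the actual TIR subset, GPT-4 played the generator role and the execution results were used to build the 73k synthetic solutions.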
huggingface.co
NuminaMath - a AI-MO Collection
Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize
An article on how you can dynamically spawn and place your content objects on specific surface types using Meta XR Mixed Reality Utility Kit (MRUK), based on Scene Understanding's semantically labeled surfaces.
Mixed Reality Now: AR/VR/MR/XR Design & Development Stories
Building MR apps using physical surfaces with Meta MR Utility Kit - Mixed Reality Now
Being able to use the real-life physical environment as a canvas is one of the most exciting parts of Mixed Reality for us designers, developers, and creators. With Meta Quest's various sophisticated spatial awareness capabilities such as Scene Understanding…
OpenAI CEO Sam Altman gave low-income people $1,000/month for three years, no strings attached. Here's what happened.
Now, the results of one of the largest guaranteed-basic-income studies are in.
Back in 2019, 3,000 Texas and Illinois residents were enrolled in this guaranteed-basic-income study.
The experiment was funded by Sam Altman, who raised $60 million for the study ($14 million of which was his own money).
All participants had incomes below $28,000.
1/3 got $1,000/month for three years while the remaining control group members got $50 per month.
For those who received the $1,000 payments, overall spending increased by ~$310/month.
Most of that money went toward food, transportation, and rent.
There was also an increase in offering financial support to others.
However, that doesn't mean those who received the $1,000 payments saw improvements across the board.
There was no "direct evidence of improved access to healthcare or improvements to physical and mental health," researchers found.
While there was an increase in life satisfaction for a short time at the start of the study, it didn't last.
"Cash alone cannot address challenges such as chronic health conditions, lack of childcare, or the high cost of housing."
But there were certainly some benefits that can't be ignored.
For those receiving $1,000/month, their total individual savings spiked by almost 25%.
And incomes rose significantly for all groups.
Recipients of the $1,000/month saw incomes rise from ~$30,000 to $45,710, on average.
Incomes for those in the control group rose even more, to $50,970.
"Cash offers flexibility and may increase agency to make employment decisions that align with recipients' individual circumstances, goals, and values," according to researchers.
As for what the study participants themselves felt, most couldn't believe their luck when selected to participate.
"Looking back, I regret that I didn't save more of it," one said.
"It's almost like a miracle," another said.
OpenResearch
Findings
OpenResearch is a nonprofit research lab that seeks to answer open-ended questions.
Enterprise-AI-focused startup Cohere raised $500 MM at a $5.5 Bn valuation.
Highlights:
- Revenue of $35 MM ARR, up from $13 MM ARR end of 2023
- $5.5 Bn valuation implies a 150x (ish) price to sales multiple. 2x valuation from last year
Investors:
The company has raised $500 million in a Series D funding, it plans to announce on Monday.
The round was led by Canadian pension investment manager PSP Investments, alongside a syndicate of additional new backers including investors at Cisco Systems Inc., Japan's Fujitsu, chipmaker Advanced Micro Devices Inc.'s AMD Ventures, and Canada's export credit agency EDC.
Customers: Cohere has customers across a wide range of industries.
They include banks, tech companies and retailers.
One luxury consumer brand is using a virtual shopping tool Cohere built to help workers suggest products to customers. Toronto-Dominion Bank, a new customer, will use Cohere's AI for tasks such as answering questions based on financial documents.
Sourcing:
Cohere's models can be used across 10 languages, including English, Spanish, Chinese, Arabic, and Japanese, and its models can cite sources in answers.
Bloomberg.com
AI Startup Cohere Valued at $5.5 Billion in New Funding Round
The Canadian AI unicorn doesn't have a viral chatbot, but it's signed on hundreds of corporate clients.
Apple presents SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
It achieves comparable or better performance than SotA video LLMs that are fine-tuned on video datasets, while being training-free.
arXiv.org
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large...
We propose SlowFast-LLaVA (or SF-LLaVA for short), a training-free video large language model (LLM) that can jointly capture detailed spatial semantics and long-range temporal context without...
Chinese team developed a fabrication method to produce a semiconductor material just 0.7 nanometres thick.
A team led by Liu Kaihui of Peking University, Liu Can of Renmin University, and Zhang Guangyu of the Institute of Physics at the Chinese Academy of Sciences developed a fabrication method to produce a semiconductor material just 0.7 nanometres thick.
The researchers' findings, which were published in the peer-reviewed journal Science on July 5, address a key barrier to reducing the size of traditional silicon-based chips: as devices shrink, silicon chips run into physical limits that affect their performance.
The scientists explored two-dimensional (2D) transition-metal dichalcogenides (TMDs) as an alternative to silicon, with a thickness of just 0.7 nanometres compared to silicon's typical 5-10 nanometres.
TMDs also consume less power and have superior electron transport properties, making them ideal for the ultra-scaled-down transistors that will be a feature of next-generation electronic and photonic chips.
However, producing TMDs has been challenging, until now. According to the paper, the technique developed by the scientists allows them to quickly produce high-quality 2D crystals in seven formulations, making mass production feasible.
The traditional fabrication process, which involves layer-by-layer assembly of atoms on a substrate, like building a wall with bricks, often results in crystals with insufficient purity, Liu Kaihui told state news agency Xinhua.
"This is due to uncontrollable atomic arrangements in crystal growth and the accumulation of impurities and defects," he said.
The team arranged the first layer of atoms on the substrate as if they were following the traditional process. However, subsequent atoms were added between the substrate and the first crystal layer, pushing upwards like bamboo shoots to form new layers.
South China Morning Post
Have Chinese scientists cracked code to making ultra-thin semiconductor material?
Research by Chinese team addresses key barrier to reducing the size of traditional silicon-based chips.
Can an organism understand the code it is programmed in? Humans are getting close to this with new Generative AI models trained directly on biological data.
Anyone reading this post is programmed by the biological code - DNA, RNA, and Proteins.
With LLMs now being trained directly on biological code, we are rapidly moving towards empowering ourselves, as a species, with the toolset to decipher our own programming language better.
So, how exactly are LLMs trained directly on biological data? Let's take protein data as an example, but the same paradigm applies to DNA or RNA.
This is a sneak peek into work at Converge Bio. Here are the five steps:
1. Assemble a massive protein sequence database: With genome sequencing becoming cheaper every year, we now have billions of publicly available protein sequences for building these databases.
2. Tokenize the protein database: In this step, they build a dictionary of the "words" of the protein language. These words are called tokens and they are the atomic elements that LLMs learn from. There is a huge amount of research we and others are doing on how to best divide the protein language into tokens.
3. Unsupervised Generative Pre-Training (GPT): Train a transformer-based model by hiding part of the sequence and training the model to fill in the missing tokens. They now have a model that deeply understands the statistical distribution of information in our massive database. At this stage, the trained model is often referred to as a foundational model.
4. Supervised multi-task training: They now train the foundational model to understand not only the statistical distribution of information in the data but also how that information translates into real-world biological traits. This is done by training the model on any labeled, trusted dataset you can get your hands on that connects a protein sequence with a biological outcome. In the protein context, a few examples: protein structures, protein annotations, and binding affinity experimental data. Research in AI shows that by using this paradigm, the model becomes "smarter" when introduced to multiple diverse tasks (similar to a child learning new skills).
5. Fine-tuning: Given a new biological question, you can now fine-tune the model with a relatively small dataset and use it to predict complex biological interactions, explain the model's decision in a protein sequence context, and generate novel and better-performing proteins.
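Steps 2-3 above can be sketched in a few lines: per-residue tokenization plus random masking, which yields the (input, label) pairs a fill-in-the-missing-tokens pre-training objective consumes. The per-residue vocabulary and the 15% mask rate are illustrative choices, not Converge Bio's actual scheme:

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues as a toy vocabulary

def tokenize(sequence: str) -> list:
    """Step 2, simplest scheme: one token per residue. Real work also
    explores subword-style protein tokenizers."""
    return [aa for aa in sequence if aa in AMINO_ACIDS]

def mask_tokens(tokens, rate: float = 0.15, seed: int = 0):
    """Step 3: hide a fraction of tokens; the model is trained to fill them in.
    Returns the corrupted sequence and the position -> true-token labels."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked.append("<MASK>")
            targets[i] = tok  # training label at this position
        else:
            masked.append(tok)
    return masked, targets

tokens = tokenize("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
masked, targets = mask_tokens(tokens)
print(sum(t == "<MASK>" for t in masked), "positions masked out of", len(tokens))
```

A transformer would then be trained to predict each `targets[i]` from the corrupted sequence, which is exactly the objective that turns raw sequence databases into a foundational model.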