Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
Letting an LLM recursively criticize and improve its own output significantly outperforms existing LLM methods on computer tasks and surpasses supervised learning (SL) and reinforcement learning (RL) approaches.

"In this work, we show that a pre-trained large language model (LLM) agent can execute computer tasks guided by natural language using a simple prompting scheme where the agent recursively criticizes and improves its output (RCI)."

Paper: https://arxiv.org/abs/2303.17491
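The RCI scheme described above reduces to a short loop: generate an answer, ask the model to critique it, then ask for a revision conditioned on that critique. A minimal Python sketch, assuming `llm` is any prompt-to-completion callable; the prompt wording and the `toy_llm` stub are illustrative, not the paper's exact prompts.

```python
def rci(llm, task, rounds=3):
    """Recursive criticize-and-improve (RCI) loop, sketched.

    `llm` is any callable mapping a prompt string to a completion
    string; the prompt wording below is illustrative.
    """
    output = llm(f"Task: {task}\nAnswer:")
    for _ in range(rounds):
        critique = llm(
            f"Task: {task}\nAnswer: {output}\n"
            "Review the answer and find any problems with it."
        )
        output = llm(
            f"Task: {task}\nAnswer: {output}\n"
            f"Critique: {critique}\n"
            "Based on the critique, write an improved answer."
        )
    return output


# Toy deterministic "model" so the loop can be exercised offline:
# it "improves" an answer by appending a tick mark per revision.
def toy_llm(prompt):
    if "improved answer" in prompt:
        answer = prompt.split("Answer: ")[1].split("\n")[0]
        return answer + "+"
    if "find any problems" in prompt:
        return "could be more precise"
    return "draft"

print(rci(toy_llm, "2+2?", rounds=2))  # draft++
```

With a real model in place of `toy_llm`, each round is two extra calls, so the cost grows linearly with the number of critique rounds.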
"Today, we're sharing two major advancements in our work toward general-purpose embodied AI agents: VC-1 & ASC. We're excited for how this work will help build toward a future where AI agents can assist humans in both the virtual & physical world."

https://ai.facebook.com/blog/robots-learning-video-simulation-artificial-visual-cortex-vc-1/
Hinton: “That's an issue, right. We have to think hard about how to control that.”

Reporter: “Can we?”

Hinton: “We don't know. We haven't been there yet. But we can try!”

Reporter: “That seems kind of concerning?”

Hinton: “Uh, yes!”

Source: https://www.cbsnews.com/video/godfather-of-artificial-intelligence-talks-impact-and-potential-of-new-ai/
Links for 2023-04-01

1. “Given the enormous number of instructional videos available online, learning a diverse array of multi-step task models from videos is an appealing goal. We introduce a new pre-trained video model, VideoTaskformer, focused on representing the semantics and structure of instructional videos.” https://medhini.github.io/task_structure/

2. “here’s a force-directed knowledge graph interface for @OpenAI’s gpt-4. given a topic, it prompts new questions to ask based on its own generated responses, allowing curiosity-led exploration of a concept.” https://twitter.com/hturan/status/1641780868640374784

3. TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs https://arxiv.org/abs/2303.16434

4. ReBotNet: Fast Real-time Video Enhancement https://jeya-maria-jose.github.io/rebotnet-web/

5. “Chatbase lets you create a custom ChatGPT from your data, customize its UI, and embed it on your website as a chat bubble or an iframe!” https://twitter.com/yasser_elsaid_/status/1640391858143608843

6. Incorporating AI to query your SQL database https://twitter.com/JonZLuo/status/1638638298666004483

7. “Rather than discarding a previous version of a model, Kim and his collaborators use it as the building blocks for a new model. Using machine learning, their method learns to “grow” a larger model from a smaller model in a way that encodes knowledge the smaller model has already gained. This enables faster training of the larger model.” https://news.mit.edu/2023/new-technique-machine-learning-models-0322

8. Former UK government advisor: We’re giving away AGI to the private sector. Why? https://jameswphillips.substack.com/p/securing-liberal-democratic-control

9. “AI takeover is very likely: Humans compete with each other, but that doesn't "keep us in check" from the perspective of chimpanzees. Our concepts are too advanced for them. They lack the ability to participate in our economy. The world is now a human world, not a chimp world.” https://twitter.com/JeffLadish/status/1639194473350717442

10. A GPT-4-based worm https://github.com/refcell/run-wild

11. “i have been told that gpt5 is scheduled to complete training this december and that openai expects it to achieve agi.” https://www.nextbigfuture.com/2023/03/agi-level-gpt5-in-nine-months.html

12. OpenAI is apparently denying that it is 'currently training GPT-5': “Hannah Wong, a spokesperson for OpenAI, says the company spent more than six months working on the safety and alignment of GPT-4 after training the model. She adds that OpenAI is not currently training GPT-5.” https://www.wired.com/story/chatgpt-pause-ai-experiments-open-letter/

13. With GPT-4, OpenAI Is Deliberately Slow Walking To AGI https://www.piratewires.com/p/openai-slowing-walking-gpt

14. What can we learn from hunter-gatherers about children's mental health? An evolutionary perspective https://acamh.onlinelibrary.wiley.com/doi/10.1111/jcpp.13773
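Item 7's "grow a larger model from a smaller one" belongs to a line of work on function-preserving model growth. The linked MIT method learns the growth operator; the sketch below shows only the classic hand-designed version of the idea (Net2Net-style widening, an assumption for illustration, not the linked method): duplicate a hidden neuron and halve its outgoing weights, so the wider network starts out computing exactly the same function as the smaller one.

```python
import numpy as np

def widen(W1, b1, W2, neuron):
    """Function-preserving widening of one hidden layer: duplicate
    one hidden neuron, then halve its outgoing weights so the
    widened network computes exactly the same function."""
    W1w = np.vstack([W1, W1[neuron]])       # copy incoming weights
    b1w = np.append(b1, b1[neuron])         # copy bias
    W2w = np.hstack([W2, W2[:, [neuron]]])  # copy outgoing weights
    W2w[:, neuron] /= 2                     # split the neuron's
    W2w[:, -1] /= 2                         # contribution in half
    return W1w, b1w, W2w

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)  # 3 inputs -> 4 hidden
W2 = rng.normal(size=(2, 4))                          # 4 hidden -> 2 outputs
x = rng.normal(size=3)

relu = lambda z: np.maximum(z, 0)
small = W2 @ relu(W1 @ x + b1)
W1w, b1w, W2w = widen(W1, b1, W2, neuron=1)           # now 5 hidden units
big = W2w @ relu(W1w @ x + b1w)
print(np.allclose(small, big))  # True
```

Because the widened network is an exact functional copy, training it is a warm start rather than training from scratch, which is the speedup the article describes.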
(Something I wrote in 2013.)

A rarely mentioned side effect of superhuman artificial general intelligence is that, even if it doesn't kill us, it will remove almost all meaning. Everything anyone cares about will be literally one prayer away from being fulfilled.

For example, what if you came up with a philosophical conundrum? Well, just ask God to solve it for you. And if you're not smart enough to understand the solution, just ask God to make you smart enough.

Or what if you wanted to do mathematics? You could trivially integrate the resources of a specialized Matrioshka brain into your consciousness and implement and run an ideal mathematician.

Everything you could do has either already been done or can be done better by God.

But surely, you wonder, there must be fantastic virtual environments to explore. And what about sex? Well, God thoroughly understands what it is that makes exploration and sex fun for humans. It knows how to implement the ideal adventure in which you save people of maximal sexual attractiveness and could instantly integrate the memory of such an adventure for you, or simulate it a billion times in a few nanoseconds. And the same is true for all possible permutations that are less desirable.

But the consequences are even deeper than that. Concepts such as creativity or fun will be perfectly understood as mechanical procedures that God can easily implement and maximize. For God, human happiness is conceptually no more interesting than an involuntary muscle contraction. For God, the incomprehensible complexity of your human values is a conceptually simplistic and transparent set of rules, barely more intriguing than a rat pressing a lever to receive a short electric stimulation of its reward center.

In summary, artificial general intelligence is literally the last discovery we have to make. At that point, the universe has understood itself.

The movie has been watched.

The game has been won.

The end.
Links for 2023-04-02

1. “Strong problem-solving systems can be built from AI systems that play diverse roles, LLMs can readily play diverse roles in role architectures, and AI systems based on role architectures can be practical, safe, and effective in undertaking complex and consequential tasks.” https://www.lesswrong.com/posts/AKaf8zN2neXQEvLit/role-architectures-applying-llms-to-consequential-tasks

2. Five years of progress in GPTs: A summary of the progression of the SOTA in language models https://finbarrtimbers.substack.com/p/five-years-of-progress-in-gpts

3. Learning from human-written natural language feedback is both more effective and sample-efficient than training exclusively on demonstrations of code generation tasks. https://arxiv.org/abs/2303.16749

4. ChatGPT for threat analysis of your code: “In just 2 days, we confirmed 227 vulnerable and malware packages, all discovered with the help of ChatGPT” https://twitter.com/feross/status/1641548124366987264

5. GPT is becoming a Turing machine: Here are some ways to program it https://arxiv.org/abs/2303.14310

6. “Our model achieves the SOTA on image-text and text-image retrieval, video question answering and open-vocabulary detection tasks, outperforming much larger and more extensively trained foundational models.” https://arxiv.org/abs/2303.16839

7. “We observe that scaling Vision Transformers increases [out-of-distribution] performance: even though ImageNet accuracy saturates, we see a significant increase on ObjectNet from ViT-e to ViT-22B…” https://ai.googleblog.com/2023/03/scaling-vision-transformers-to-22.html

8. Detecting novel systemic biomarkers in external eye photos https://ai.googleblog.com/2023/03/detecting-novel-systemic-biomarkers-in.html

9. “Micro-reactors are quite the popular topic right now, so let's talk about how you make a REALLY micro-reactor using the best (thermal) nuclear fuel we know of, Americium! Specifically, the isotope Am-242m.” https://twitter.com/GBruhaug/status/1638998500770992130

10. How your brain data could be used against you [MIT Technology Review] https://archive.is/xb7Pc (Related science fiction short stories (highly recommended): 1. https://qntm.org/mmacevedo 2. https://zerohplovecraft.substack.com/p/key-performance-indicators)

11. ZeFrank on predatory mussels and their mimicry: “This is one of my all-time favorite examples of the power of natural selection, and one I taught in my evolution class as an example of mimicry.” https://whyevolutionistrue.com/2023/03/23/zefrank-on-predatory-mussels/

12. Don’t panic about social media harming your child’s mental health – the evidence is weak https://inews.co.uk/news/technology/dont-panic-about-social-media-harming-your-childs-mental-health-the-evidence-is-weak-2230571

13. Tankers of 54th Mechanized Brigade attacking Russians near Verkhnokamyanske, March 31 https://www.youtube.com/watch?v=0EvE6KMmJ70
Links for 2023-04-03

1. DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents — “It provides a simple, interpretable forum for models to communicate feedback and iteratively improve output. We frame our dialog as a discussion between two agent types - a Researcher, who processes information and identifies crucial problem components, and a Decider, who has the autonomy to integrate the Researcher's information and makes judgments on the final output…shows significant improvement over the base GPT-4 performance in both human expert preference evaluations and quantitative metrics.” https://arxiv.org/abs/2303.17071

2. GPT 4 Can Improve Itself - (ft. Reflexion, HuggingGPT, Bard Upgrade and much more) https://youtu.be/5SgJKZLBrmg

3. Auto-GPT can recursively debug code. Andrej Karpathy: “Next frontier of prompt engineering imo: "AutoGPTs" . 1 GPT call is just like 1 instruction on a computer. They can be strung together into programs. Use prompt to define I/O device and tool specs, define the cognitive loop, page data in and out of context window, .run().” https://twitter.com/karpathy/status/1642598890573819905

4. “…we find interesting evidence that simple sequence prediction can lead to the formation of a world model…Our experiment provides evidence supporting that these language models are developing world models and relying on the world model to generate sequences.” https://thegradient.pub/othello/

5. “Actually, Othello-GPT Has A Linear Emergent World Representation ... "predict the next token" transformer models are capable of learning a model of the world.” https://www.lesswrong.com/posts/nmxzr2zsjNtjaHh7x/actually-othello-gpt-has-a-linear-emergent-world

6. Modern language models refute Chomsky’s approach to language https://lingbuzz.net/lingbuzz/007180

7. Prediction market: Conditional on an okay outcome with AGI, how did that happen? https://www.lesswrong.com/posts/uNepkB5EqETC8b9C2/manifold-if-okay-agi-why

8. The most comprehensive database on DARPA neurotech projects, program managers, and outcomes (probably). Plus, a case study. https://cell.substack.com/p/darpa-neurotech

9. Size matters: Bigger brains equal more complex hand movements in primates. Big brains come at a big cost, however. https://bigthink.com/life/primate-brain-size/

10. Modern neuroscience confirms race differences in brain size and functioning https://kirkegaard.substack.com/p/modern-neuroscience-confirms-race (James Thompson: Racial differences are brain deep https://thompsonj.substack.com/p/not-unreasonable)

11. Meta-analysis of quasi-experimental studies on how child abuse impacts mental health finds that about 45% of the correlation normally observed is non-causal. The remaining effect is not large and is similar for physical, sexual, and emotional abuse. (via @Sean__Last) https://gwern.net/doc/psychiatry/2023-baldwin.pdf

12. 60 Minutes explores advancements in artificial prosthetics technology that can now restore a sense of touch. https://www.cbsnews.com/news/advancements-in-prosthetics-limb-technology-allow-feeling-control-60-minutes-transcript-2023-03-26/

13. Thread about Richard III: “And whether Richard III was a true heir or fraud, the big picture's the same: descended from Anatolians who became Romans, who in turn entered the barbarian elite and climbed to its top, 1000 miles away, in 1000 years.” https://twitter.com/paul_hundred/status/1639824849899069441
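Karpathy's description in item 3 (one GPT call as one instruction, strung into a program by a cognitive loop that pages tool results in and out of the context window) can be sketched as a small dispatch loop. Everything here, including the prompt format, the tool names, and the `toy_llm` stub, is an illustrative assumption, not AutoGPT's actual implementation.

```python
def agent_loop(llm, goal, tools, max_steps=10):
    """Minimal AutoGPT-style cognitive loop: each iteration is one
    model call that either invokes a named tool or finishes; tool
    results are paged back into the context for the next call."""
    context = f"Goal: {goal}"
    for _ in range(max_steps):
        action = llm(context)                 # e.g. "calc: 6*7" or "done: 42"
        name, _, arg = action.partition(": ")
        if name == "done":
            return arg
        result = tools[name](arg)             # run the requested tool
        context += f"\n{action} -> {result}"  # page the observation back in
    return None


# Toy model + tool so the loop is runnable offline: first call asks
# for a calculation, second call returns the observed result.
def toy_llm(context):
    return "calc: 6*7" if "->" not in context else "done: " + context.rsplit("-> ", 1)[1]

answer = agent_loop(toy_llm, "compute 6*7", {"calc": lambda e: str(eval(e))})
print(answer)  # 42
```

The `max_steps` cap matters in practice: without it, a model that never emits a terminating action loops (and bills) forever.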
Self-Refine: Iterative Refinement with Self-Feedback https://selfrefine.info

Presents a novel approach that allows LLMs to iteratively refine outputs and incorporate feedback along multiple dimensions to improve performance on diverse tasks.
Eight Things to Know about Large Language Models

https://cims.nyu.edu/~sbowman/eightthings.pdf
"Witnessed and recorded the most INSANE car crash yesterday, you can see Autopilot also swerve and avoid the rogue tire for me $TSLA"

https://twitter.com/Anoop_Khatra/status/1639460487166586881
Links for 2023-04-04

1. CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society https://www.camel-ai.org/

2. Self-healing code GitHub action pipeline using @LangChainAI and @OpenAI. https://twitter.com/calvinhoenes/status/1642441789033578498 (code: https://github.com/xpluscal/selfhealing-action-express)

3. LLM Modularity: The Separability of Capabilities in Large Language Models https://www.lesswrong.com/posts/j84JhErNezMxyK4dH/llm-modularity-the-separability-of-capabilities-in-large

4. "10-word quote": a short and simple failure mode of ChatGPT https://www.reddit.com/r/slatestarcodex/comments/1201v68/10word_quote_a_short_and_simple_failure_mode_of/

5. Language Models Can Predict Public Opinion https://arxiv.org/abs/2303.16779

6. Can ChatGPT Decipher Fedspeak? Yes! https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4399406

7. “ChatGPT is Inform8: interactive fiction with dual narration and action planes (both playable), dialogues, inner thoughts and the ability to argue about narration to change the course of action.” https://old.reddit.com/r/interactivefiction/comments/11z6p84/chatgpt_is_inform8_interactive_fiction_with_dual/

8. GPT-4 for “just-in-time” UI generation https://twitter.com/mlejva/status/1641151421830529042

9. York student uses AI chatbot to get parking fine revoked https://www.bbc.com/news/uk-england-york-north-yorkshire-65126772

10. By Cracking a Metal 3D-Printing Conundrum, Researchers Propel the Technology Toward Widespread Application https://www.nist.gov/news-events/news/2023/03/cracking-metal-3d-printing-conundrum-researchers-propel-technology-toward

11. Accelerating genetic design: Mixing and matching sequencing tools to build biological circuits faster https://centuryofbio.substack.com/p/accelerating-genetic-design

12. Sensory strength distinguishes imagination from reality https://www.nature.com/articles/s41467-023-37322-1

13. 51% of American adults favored conducting “a major research program” into solar geoengineering when it was explained to them. 67% supported deploying it if the government found it to be “inexpensive, effective, and low-risk.” https://archive.is/b1LSn
Derek Parfit in his book Reasons and Persons:

“Compare three outcomes:

1. Peace
2. A nuclear war that kills 99% of the world’s existing population.
3. A nuclear war that kills 100%.

(2) would be worse than (1), and (3) would be worse than (2).

Which is the greater of these two differences?

Most people believe that the greater difference is between (1) and (2). I believe that the difference between (2) and (3) is very much greater.”

Extinction would be much worse because it prevents the existence of all future generations.
On the day that formerly neutral Finland joins NATO, let us remember once again that this would not have been possible without the strategic genius of Vladimir Vladimirovich Putin.

His achievements are manifold:

1. No more talk about cutting defense spending among European nations. On the contrary, everyone is talking about modernization and increased funding.

2. The Western military industry has received a massive boost. Poland's $10 billion order for HIMARS rocket launchers and ammunition is just the beginning, as Russia is increasingly unable to meet its arms delivery commitments.

3. Ukraine is more united than ever and its self-identity has been massively strengthened. The alleged threat posed by Ukraine has not been neutralized. On the contrary, Ukraine is now capable of launching frequent attacks inside Russian territory.

4. Russia has been severely weakened. Russia has lost most of its considerable soft power over Western nations and access to Western markets. The image of Russia as a superpower has been completely shattered. And without Western technology and with China as its main export market, Russia's future will be that of a Chinese vassal state. A decidedly worse prospect for its national identity than before.
“It takes around 6 to 8 years to build a nuclear reactor. That’s the average construction time globally. Reactors can be built very quickly: some have been built in just 3 to 5 years.”

https://hannahritchie.substack.com/p/nuclear-construction-time
Links for 2023-04-05

1. “There have been an incredible number of gene-editing advances (including in delivery) over the last three days. Here are 7 of them...” https://twitter.com/NikoMcCarty/status/1641811586431213568

2. Eliezer Yudkowsky: “So the actual scary part to me is that GPT4 understands what it means to say, "Compress this in a way where *you* can decompress it."…("Stochastic parrot" my fucking ass.)” https://twitter.com/ESYudkowsky/status/1643428537821720578 (for updates on this see e.g. this tweet and others by the same author: https://twitter.com/gfodor/status/1643415357615640577)

3. SudoLang: A Powerful Pseudocode Programming Language for LLMs https://medium.com/javascript-scene/sudolang-a-powerful-pseudocode-programming-language-for-llms-d64d42aa719b

4. Build an entire data workflow with just a one-sentence prompt https://twitter.com/EinblickAI/status/1641518278668582912

5. “What is the chance your vote decides the whole election? I've found an elegant proof that unless the election is a forgone conclusion, the chance can't be much lower than 1 in the number of voters.” http://www.tobyord.com/writing/decisive-vote

6. Why Starting A Rocket Engine Is So Hard! https://www.youtube.com/watch?v=bAUVCn_jw5I

7. difflogic - A Library for Differentiable Logic Gate Networks https://github.com/Felix-Petersen/difflogic

8. Closed-form parametric equations for plain-knit yarns and the twisted fibers running around them. Also includes C code to generate curves, and displacement/alpha maps for making tiled patterns. http://www.cs.cmu.edu/~kmcrane/Projects/Other/YarnCurve.pdf

9. “The normalization scheme that DeepMind researchers came up with for their "linear recurrent unit" (LRU) is a nice example of how it is possible to predictably engineer circuits in artificial neural networks, when you know what you're doing.” https://twitter.com/CFGeek/status/1640445451387412481

10. Floating Nuclear Power Buoyant on New Prospects https://www.powermag.com/floating-nuclear-power-buoyant-on-new-prospects/

11. “China spent $240 billion bailing out 22 developing countries between 2008 and 2021, with the amount soaring in recent years as more have struggled to repay loans spent building ‘Belt & Road’ infrastructure” [Reuters] https://archive.is/dpRuE

12. "Russia’s economy is in the tank and the reason many of us have thought differently is that Putin literally makes up the numbers." https://www.econlib.org/jeff-sonnenfelds-bombshell-about-the-russian-economy/

13. Satellite images: A web of trenches shows Russia fears losing Crimea https://web.archive.org/web/20230405012231/https://www.washingtonpost.com/world/interactive/2023/ukraine-russia-crimea-battle-trenches/

14. How the UAE paid a Swiss firm to destroy an oil-trading firm using negative PR, in part by editing Wikipedia articles about it [The New Yorker] https://archive.is/v3L8t
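Item 5's claim, that a decisive vote can't be much less likely than one in the number of voters unless the outcome is a foregone conclusion, has a clean closed form in the simplest model (a standard illustration, not necessarily Ord's own proof): if 2k other voters each vote one way with probability p, and p is uniform on [0, 1] to reflect genuine uncertainty, the chance of an exact tie, the case where your vote decides, is exactly 1/(2k + 1).

```python
from fractions import Fraction
from math import comb, factorial

def tie_probability(k):
    """P(exact tie among 2k other voters) when each votes 'yes' with
    probability p and p is uniform on [0, 1]:
    integral of C(2k,k) p^k (1-p)^k dp
      = C(2k,k) * k! * k! / (2k+1)!  =  1/(2k+1)."""
    beta = Fraction(factorial(k) * factorial(k), factorial(2 * k + 1))
    return comb(2 * k, k) * beta

# Exact for any electorate size: chance of a decisive vote is
# 1/(number of voters + 1) under this uniform-uncertainty prior.
for k in (1, 10, 5000):
    assert tie_probability(k) == Fraction(1, 2 * k + 1)
print(tie_probability(5000))  # 1/10001
```

The uniform prior is the "not a foregone conclusion" assumption; concentrating p near 0 or 1 makes a tie astronomically unlikely, which is exactly Ord's caveat.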
"Introducing DribbleBot: A robot that can dribble a soccer ball on diverse natural terrains. Be it snow, be it grass, be it pavement, or be it sand, it keeps going! If the robot falls, it automatically recovers and keeps dribbling."

https://gmargo11.github.io/dribblebot/
Meta AI: "Today we're releasing the Segment Anything Model (SAM) — SAM is capable of one-click segmentation of any object from any photo or video + zero-shot transfer to other segmentation tasks"

https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/
Links for 2023-04-06

1. TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings — Its optical circuit switches (OCSes) are much cheaper, lower power, and faster than Infiniband; the OCSes and underlying optical components are <5% of system cost and <3% of system power. TPU v4 outperforms TPU v3 by 2.1x, and the TPU v4 pod is 4x larger at 4096 chips, making it ~10x faster overall. https://arxiv.org/abs/2304.01433

2. Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data -- Proposes a pipeline that can automatically generate a high-quality multi-turn chat corpus by leveraging ChatGPT to engage in a conversation with itself. https://arxiv.org/abs/2304.01196

3. “REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments.” https://arxiv.org/abs/2304.01904

4. “…we are releasing Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula. These are the highest accuracy models for a compute budget and are available today open-source!” https://twitter.com/CerebrasSystems/status/1640725880711569408

5. “We present Cluster-Branch-Train-Merge (c-BTM), a new way to scale sparse expert LLMs on any dataset — completely asynchronously.” https://twitter.com/ssgrn/status/1640322362100051968

6. unarXive 2022: All arXiv Publications Pre-Processed for NLP, Including Structured Full-Text and Citation Network https://arxiv.org/abs/2303.14957

7. Robotic hand can identify objects with just one grasp https://news.mit.edu/2023/robotic-hand-can-identify-objects-just-one-grasp-0403

8. To Make Self-Driving Cars Safer, Expose Them to Terrible Drivers https://singularityhub.com/2023/03/31/to-make-self-driving-cars-safer-expose-them-to-terrible-drivers/

9. “I’m scared of AGI. It's confusing how people can be so dismissive of the risks. I’m an investor in two AGI companies and friends with dozens of researchers working at DeepMind, OpenAI, Anthropic, and Google Brain. Almost all of them are worried.” https://twitter.com/arram/status/1642614341622181889

10. IQ estimates of public intellectuals and personas https://kirkegaard.substack.com/p/iq-estimates-of-public-intellectuals

11. Calling the Lab-Leak Theory 'Disinformation' Created Disinformation [The New York Times] https://archive.is/6YLW8

12. Do Politicians Ignore NIH Ties with Wuhan Lab to Protect Biodefense "Contract Racket"? https://disinformationchronicle.substack.com/p/do-politicians-ignore-nih-ties-with